Date: Express Mail Label No. £12401 315**305 



Inventors: John T. Gray and Richard C. Mulligan 

Attorney's Docket No. : CMCC-693p2A 

PACKAGING CELL LINES 

RELATED APPLICATIONS 

This application claims the benefit of U.S. Provisional Application No. 
60/100,063, filed September 12, 1998 and U.S. Provisional Application No. 60/100,022, 
filed September 11, 1998, the contents of which are incorporated herein by reference in 
their entirety. 

BACKGROUND OF THE INVENTION 

Retroviral vectors based on lentiviruses, such as human immunodeficiency 
viruses (HIV), can infect nondividing cells, and integration of proviral DNA occurs 
without the need for cell division. These properties make lentiviruses attractive for gene 
transfer into nondividing cells, such as hepatocytes, myofibers, hematopoietic stem 
cells, and neurons. 

However, the use of lentivirus vectors, particularly HIV vectors, for gene therapy 
is hampered by concern over their safety. Thus, a need exists for the development of 
lentivirus vectors, particularly HIV vectors, with improved safety for gene therapy. 



SUMMARY OF THE INVENTION 

The present invention relates to novel packaging cell lines useful for generating 
viral accessory protein independent lentivirus-derived, particularly HIV-derived, 
retroviral vector particles, to construction of such cell lines and to methods of using the 
accessory protein independent lentivirus-derived retroviral vector particles to introduce 
DNA of interest into cells (e.g., eukaryotic cells such as animal (particularly 
mammalian), plant or yeast cells or prokaryotic cells such as bacterial cells). In a 
preferred embodiment, the packaging cell lines of the present invention are stable 
packaging cell lines. 

In one embodiment of the invention, packaging cell lines for producing viral 
accessory protein independent lentivirus-derived retroviral vector particles comprise (a) 
a cell (e.g., mammalian cell); and (b) a retroviral nucleotide sequence in the cell which 
comprises a coding sequence for lentivirus gagpol, wherein said coding sequence has 
been codon optimized by mutagenesis to improve expression of the lentivirus gagpol 
proteins. 

In a second embodiment of the invention, packaging cell lines for producing 
viral accessory protein independent lentivirus-derived retroviral vector particles 
comprise (a) a cell (e.g., mammalian cell); (b) a first retroviral nucleotide sequence in 
the cell which comprises a coding sequence for lentivirus gagpol, wherein said coding 
sequence has been codon optimized by mutagenesis to improve expression of the 
lentivirus gagpol proteins; and (c) a second retroviral nucleotide sequence in the cell 
which comprises the coding sequence for a heterologous envelope protein. 

In a third embodiment of the invention, packaging cell lines for producing viral 
accessory protein independent lentivirus-derived retroviral vector particles comprise (a) 
a cell (e.g., mammalian cell); (b) a first retroviral nucleotide sequence in the cell which 
comprises a coding sequence for lentivirus gagpol, wherein said coding sequence has 
been codon optimized by mutagenesis to improve expression of the lentivirus gagpol 
proteins; (c) a second retroviral nucleotide sequence in the cell which comprises the 
coding sequence for a heterologous envelope protein; and (d) a third retroviral 
nucleotide sequence which comprises a DNA sequence of interest and lentivirus 
cis-acting sequences required for packaging, reverse transcription and integration. 

In a fourth embodiment of the invention, packaging cell lines for producing 
viral accessory protein independent lentivirus-derived retroviral vector particles 
comprise (a) a cell (e.g., mammalian cell); (b) a retroviral nucleotide sequence in the 
cell which comprises a coding sequence for lentivirus gagpol, wherein said coding 
sequence has been codon optimized by mutagenesis to improve expression of the 
lentivirus gagpol proteins; and (c) a retroviral nucleotide sequence which comprises a 
DNA sequence of interest and lentivirus cis-acting sequences required for packaging, 
reverse transcription and integration. 

In a fifth embodiment of the invention, packaging cell lines for producing viral 
accessory protein independent HIV-derived retroviral vector particles comprise (a) a cell 
(e.g., mammalian cell); and (b) a retroviral nucleotide sequence in the cell which 
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been 
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins. 

In a sixth embodiment of the invention, packaging cell lines for producing viral 
accessory protein independent HIV-derived retroviral vector particles comprise (a) a cell 
(e.g., mammalian cell); (b) a first retroviral nucleotide sequence in the cell which 
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been 
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; and 
(c) a second retroviral nucleotide sequence in the cell which comprises the coding 
sequence for a heterologous envelope protein. 

In a seventh embodiment of the invention, packaging cell lines for producing 
viral accessory protein independent HIV-derived retroviral vector particles comprise (a) 
a cell (e.g., mammalian cell); (b) a first retroviral nucleotide sequence in the cell which 
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been 
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; (c) 
a second retroviral nucleotide sequence in the cell which comprises the coding sequence 
for a heterologous envelope protein; and (d) a third retroviral nucleotide sequence which 
comprises a DNA sequence of interest and HIV cis-acting sequences required for 
packaging, reverse transcription and integration. 

In an eighth embodiment of the invention, packaging cell lines for producing 
viral accessory protein independent HIV-derived retroviral vector particles comprise (a) 
a cell (e.g., mammalian cell); (b) a retroviral nucleotide sequence in the cell which 
comprises a coding sequence for HIV gagpol, wherein said coding sequence has been 
codon optimized by mutagenesis to improve expression of the HIV gagpol proteins; and 
(c) a retroviral nucleotide sequence which comprises a DNA sequence of interest and 
HIV cis-acting sequences required for packaging, reverse transcription and integration. 

Alternatively, each of the packaging cell lines described herein can be produced 
using (1) a retroviral nucleotide sequence which comprises a codon optimized gag 
coding sequence and (2) a retroviral nucleotide sequence which comprises a codon 
optimized pol coding sequence, in place of the retroviral nucleotide sequence which 
comprises a codon optimized gagpol coding sequence. 

In a particular embodiment, the heterologous envelope protein is the G 
glycoprotein of vesicular stomatitis virus (VSV G). In another embodiment, the 
heterologous envelope protein is the amphotropic envelope of the Moloney leukemia 
virus (MLV). 

Cell lines for producing viral accessory protein independent lentivirus-derived 
retroviral vector particles are produced by transfecting host cells (e.g., mammalian host 
cells) with a plasmid comprising a DNA sequence which encodes lentivirus gagpol 
proteins, wherein said DNA sequence has been codon optimized by mutagenesis to 
improve expression of the lentivirus gagpol proteins. Depending upon the particular 
cell line being produced, the host cells are also co-transfected with a plasmid 
comprising a DNA sequence which encodes a heterologous envelope protein, or a 
plasmid comprising a DNA sequence of interest and lentivirus cis-acting sequences 
required for packaging, reverse transcription and integration, or both of these plasmids. 
Alternatively, host cells are transfected with a plasmid comprising a codon optimized 
DNA sequence encoding a lentivirus gag protein and a plasmid comprising a codon 
optimized DNA sequence encoding a lentivirus pol protein, in place of the plasmid 
comprising a codon optimized DNA sequence encoding both lentivirus gagpol proteins. 

Cell lines for producing viral accessory protein independent HIV-derived 
retroviral vector particles are produced by co-transfecting host cells (e.g., mammalian 
host cells) with a plasmid comprising a DNA sequence which encodes HIV gagpol 
proteins, wherein said DNA sequence has been codon optimized by mutagenesis to 
improve expression of the HIV gagpol proteins. Depending upon the particular cell line 
being produced, the host cells are also co-transfected with a plasmid comprising a DNA 
sequence which encodes a heterologous envelope protein, or a plasmid comprising a 
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse 
transcription and integration, or both of these plasmids. Alternatively, host cells are 
transfected with a plasmid comprising a codon optimized DNA sequence encoding an 
HIV gag protein and a plasmid comprising a codon optimized DNA sequence encoding 
an HIV pol protein, in place of the plasmid comprising a codon optimized DNA sequence 
encoding both HIV gagpol proteins. 

The present invention also relates to methods of producing viral accessory 
protein independent lentivirus-derived retroviral vector particles, comprising 
co-transfecting host cells (e.g., mammalian host cells) with (a) a first plasmid comprising a 
DNA sequence which encodes lentivirus gagpol proteins, wherein said DNA sequence 
has been codon optimized by mutagenesis to improve expression of the lentivirus gagpol 
proteins; (b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of 
interest and lentivirus cis-acting sequences required for packaging, reverse transcription 
and integration. Alternatively, host cells are transfected with a plasmid comprising a 
codon optimized DNA sequence encoding a lentivirus gag protein and a plasmid 
comprising a codon optimized DNA sequence encoding a lentivirus pol protein, in place 
of the first plasmid comprising a codon optimized DNA sequence encoding both 
lentivirus gagpol proteins. 

In a particular embodiment, the invention relates to methods of producing viral 
accessory protein independent HIV-derived retroviral vector particles, comprising 
co-transfecting host cells (e.g., mammalian host cells) with (a) a first plasmid comprising a 
DNA sequence which encodes HIV gagpol proteins, wherein said DNA sequence has 
been codon optimized by mutagenesis to improve expression of the HIV gagpol 
proteins; (b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of 
interest and HIV cis-acting sequences required for packaging, reverse transcription and 
integration. Alternatively, host cells are transfected with a plasmid comprising a codon 
optimized DNA sequence encoding an HIV gag protein and a plasmid comprising a 
codon optimized DNA sequence encoding an HIV pol protein, in place of the first 
plasmid comprising a codon optimized DNA sequence encoding both HIV gagpol 
proteins. 

The present invention also relates to viral accessory protein-independent 
retroviral particles produced by or obtainable by (obtained by) the methods described 
herein. 

The present invention further relates to isolated DNA encoding a codon 
optimized lentivirus gagpol, isolated DNA encoding the gag coding region of a codon 
optimized lentivirus gagpol, and isolated DNA encoding the pol coding region of a 
codon optimized lentivirus gagpol. In a particular embodiment, the present invention 
relates to isolated DNA encoding a codon optimized HIV gagpol, isolated DNA 
encoding the gag coding region of a codon optimized HIV gagpol, and isolated DNA 
encoding the pol coding region of a codon optimized HIV gagpol. 

The packaging cell lines and viral particles of the present invention can be used 
for gene therapy or gene replacement with improved safety. The packaging cell lines 
and viral particles of the present invention can also be used in development and 
production of vaccines, and in production of biochemical reagents. Gene therapy 
vectors produced with the cell lines of the present invention are expected to be valuable 
medical therapeutics. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a schematic diagram of an expression cassette containing the codon 
optimized gagpol genes. The DNA was constructed in multiple segments, which are 
indicated at the top as 1/3, 2/3, 3/3 (A, B, C and D) and HIN. Restriction sites used to 
assemble the cloned segments are indicated above the kilobasepair (Kb) ruler. Below 
the ruler are multiple features showing the location of the human cytomegalovirus 
(CMV) promoter, human beta-globin sequences (Bglobin), mRNA sequences (thinner 
line represents intronic sequence), the gag and pol open reading frames, the individual 
proteolytic fragment coding sequences (p17_MA, p24_CA, p7, p6, PR, p51_RT, 
RNaseH and integrase (IN)) and each synthetic oligonucleotide used in the assembly 
process (multiple adjacent open arrows). 

Figure 2 is a table which depicts codon usage frequencies in genes which are 
highly expressed and in the codon optimized gagpol open reading frame of the HIV 
packaging construct described herein. 

Figure 3 is a schematic representation of the HIV provirus and a three-plasmid 
expression system used for generating a pseudotyped HIV-based vector by transient 
transfection as described in Naldini et al., Science, 272:263-267 (1996). 

Figure 4 is a list of some characteristics relating to the HIV Rev protein. 

Figure 5 is a list of some points relating to codon optimization of HIV gagpol. 

Figure 6 is a partial DNA sequence of HIV gag (SEQ ID NO: 1), showing 
inactivation of inhibitory sequences as described in Schwartz, S. et al., J. Virol., 
65(12):7176-7182 (1992). 

Figure 7 is a plot of the %(G+C) content of wildtype HIV gagpol sequences and 
theoretically codon optimized HIV gagpol sequences. The percent of bases, either G or 
C, was calculated for a 30 nucleotide moving window for the entire length of the gagpol 
gene, and the value plotted versus nucleotide position. Diamonds = HIV gagpol 
sequences; squares = full optimal back-translation for gag open reading frame; 
triangles = full optimal back-translation for pol open reading frame; CO = codon 
optimized. 
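
The moving-window calculation described for Figure 7 can be sketched as follows. This is an illustrative Python sketch only, not the procedure used to prepare the figure; the function name `gc_percent_windows`, the `step` parameter and the toy input sequence are assumptions introduced here.

```python
def gc_percent_windows(seq, window=30, step=1):
    """Return (position, %G+C) pairs for a moving window over seq.

    position is the 1-based index of the first nucleotide in the window.
    window=30 follows the Figure 7 description; step=1 is an assumption.
    """
    seq = seq.upper()
    points = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = sum(1 for base in chunk if base in "GC")
        points.append((start + 1, 100.0 * gc / window))
    return points


# Toy input (hypothetical, not an HIV sequence):
# 30 A/T bases followed by 30 G/C bases.
toy = "AT" * 15 + "GC" * 15
profile = gc_percent_windows(toy)
```

Plotting the second element of each pair against the first would give a curve of the kind described, one curve per sequence being compared.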

Figures 8A-8E depict the alignment of the nucleotide sequences and predicted 
amino acid sequences for the gag coding region of a wildtype HIV gagpol and a codon 
optimized HIV gagpol. "NL4-3 genbank.SEQ" indicates the nucleotide sequence (SEQ 
ID NO: 2) and predicted amino acid sequence (SEQ ID NO: 3) for the gag coding region 
of a wildtype HIV gagpol. "pHDMHgpm2.seq" indicates the nucleotide sequence (SEQ 
ID NO: 4) and predicted amino acid sequence (SEQ ID NO: 5) for the gag coding region 
of a codon optimized HIV gagpol. The "NL4-3 genbank.SEQ" sequences are publicly 
available at the NIH GenBank sequence repository (Accession No. M19921). 

Figures 9A-9L depict the alignment of the nucleotide sequences and predicted 
amino acid sequences for the pol coding region of a wildtype HIV gagpol and a codon 
optimized HIV gagpol. "NL4-3 genbank.SEQ" indicates a nucleotide sequence (SEQ 
ID NO: 6) and a predicted amino acid sequence (SEQ ID NO: 7) for the pol coding 
region of a wildtype HIV gagpol available in the NIH GenBank sequence repository 
(Accession No. M19921). The nucleotide and amino acid sequences for the pol coding 
region available in the GenBank sequence repository contain two sequence errors, 
which are indicated in Figures 9A-9L with shading. "pNL4-3.seq" indicates the correct 
nucleotide sequence (SEQ ID NO: 8) and predicted amino acid sequence (SEQ ID 
NO: 9) for the pol coding region of a wildtype HIV gagpol. "pHDMHgpm2.seq" 
indicates the nucleotide sequence (SEQ ID NO: 10) and predicted amino acid sequence 
(SEQ ID NO: 11) for the pol coding region of a codon optimized HIV gagpol. 

Figures 10A-10D depict the DNA sequence (SEQ ID NO: 12) for pHDMHgpm2. 
The CMV enhancer/promoter is at nucleotides 97 to 679, human beta-globin sequences 
(Bglobin) are at nucleotides 761 to 864, 865 to 1303 and 5710 to 6469 (end of Bglobin 
is at nucleotides 6445 to 6469), mRNA sequences are at nucleotides 680 to 778 and 1255 
to 5921, SV40 origin of replication is at nucleotides 8796 to 8908, beta-lactamase (bla) 
coding region is at nucleotides 6709 to 7569, intron sequences are at nucleotides 779 to 
1254, the codon optimized gag coding region is at nucleotides 1318 to 2820, the codon 
optimized pol coding region is at nucleotides 2619 to 5624 and the poly A site is at 
nucleotides 5897 to 5921. 
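
As a reading aid, the coordinates listed for Figures 10A-10D can be held in a small table and checked arithmetically. The sketch below is illustrative Python; the dictionary keys and the helper names `length` and `overlap` are assumptions introduced here, while the coordinates are those stated above (1-based, inclusive).

```python
# Feature coordinates (1-based, inclusive) as listed for SEQ ID NO: 12.
FEATURES = {
    "CMV enhancer/promoter": (97, 679),
    "intron": (779, 1254),
    "codon optimized gag": (1318, 2820),
    "codon optimized pol": (2619, 5624),
    "poly A site": (5897, 5921),
    "bla coding region": (6709, 7569),
    "SV40 origin": (8796, 8908),
}


def length(span):
    """Number of nucleotides covered by an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def overlap(a, b):
    """Return the shared (start, end) span of two features, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


gag = FEATURES["codon optimized gag"]
pol = FEATURES["codon optimized pol"]
shared = overlap(gag, pol)
```

With these coordinates the gag and pol coding regions are 1503 and 3006 nucleotides long, both multiples of three, and they share nucleotides 2619 to 2820 (202 nucleotides).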

Figure 11 is a circular map of plasmid pHDMHgpm2. 

DETAILED DESCRIPTION OF THE INVENTION 

The present invention relates to novel packaging cell lines useful for generating 
viral accessory protein independent lentivirus-derived, particularly HIV-derived, 
retroviral vector particles, to construction of such cell lines and to methods of using the 
accessory protein independent lentivirus-derived retroviral vector particles to introduce 
DNA of interest into cells (e.g., eukaryotic cells such as animal (particularly 
mammalian), plant or yeast cells or prokaryotic cells such as bacterial cells). In a 
particular embodiment, the packaging cell lines of the present invention are stable 
packaging cell lines. 

The cell lines are engineered to express the lentivirus proteins necessary for 
virus particle formation (gagpol proteins), without containing DNA sequences from 
lentivirus accessory proteins (tat, vif, vpr, vpu, nef and rev proteins and Rev response 
element (RRE)). Additionally, no viral sequences (such as cis-acting elements termed 
constitutive transport elements (CTEs)) will be expressed as RNA of any kind. DNA 
sequences for lentivirus gagpol are codon optimized by extensively mutagenizing the 
sequences to improve expression and to reduce the risk of recombination between 
transfer vector sequences and gagpol messenger RNA. This greatly improves the safety 
of virus preparations generated from these cell lines. In a particular embodiment, the 
DNA sequences for lentivirus gagpol are not codon optimized in the overlap region 
between the gag and pol sequences and in cis-acting signals necessary for translation of 
pol. 

Examples of lentiviruses include human immunodeficiency viruses (e.g., HIV-1, 
HIV-2, HIV-3), bovine lentiviruses (e.g., bovine immunodeficiency viruses, bovine 
immunodeficiency-like viruses, Jembrana disease viruses), equine lentiviruses (e.g., 
equine infectious anemia viruses), feline lentiviruses (e.g., feline immunodeficiency 
viruses, panther lentiviruses, puma lentiviruses), ovine/caprine lentiviruses (e.g., 
Brazilian caprine lentiviruses, caprine arthritis-encephalitis viruses, Maedi-Visna 
viruses, Maedi-Visna-like viruses, Maedi-Visna-related viruses, ovine lentiviruses, 
Visna lentiviruses), simian AIDS retroviruses (e.g., human T-cell lymphotropic virus 
type 4), simian immunodeficiency viruses, simian-human immunodeficiency viruses, 
human lymphotropic viruses (e.g., type III), and simian T-cell lymphotropic viruses. 

In another embodiment, cell lines are engineered to express the HIV proteins 
necessary for virus particle formation (gagpol proteins), without containing DNA 
sequences from HIV accessory proteins (tat, vif, vpr, vpu, nef and rev proteins and Rev 
response element (RRE)). Additionally, no viral sequences (such as cis-acting elements 
termed constitutive transport elements (CTEs)) will be expressed as RNA of any kind. 
DNA sequences for an HIV gagpol are codon optimized by mutagenesis to improve 
expression and to reduce the risk of recombination between transfer vector sequences 
and gagpol messenger RNA. In a particular embodiment, the DNA sequences for HIV 
gagpol are not codon optimized in the overlap region between the gag and pol 
sequences and in cis-acting signals necessary for translation of pol. 

Alternatively, each of the packaging cell lines described herein can be produced 
using (1) a nucleotide sequence which comprises a codon optimized gag coding 
sequence and (2) a nucleotide sequence which comprises a codon optimized pol coding 
sequence, in place of the nucleotide sequence which comprises a codon optimized 
gagpol coding sequence. In this embodiment, the gag and pol coding sequences can be 
completely codon optimized. 

Benefits of the present invention include the removal of potentially harmful 
lentivirus accessory proteins and other viral sequences, and the reduction of the risk of 
recombination to produce replication competent virus. 

Packaging cell lines for producing viral accessory protein independent 
lentivirus-derived retroviral vector particles comprise a mammalian cell and a retroviral 
nucleotide sequence comprising a coding sequence for a lentivirus gagpol which has 
been codon optimized. In a particular embodiment, the packaging cell lines further 
comprise a retroviral nucleotide sequence comprising a coding sequence for a 
heterologous envelope protein. In a second embodiment, the packaging cell lines 
further comprise a retroviral nucleotide sequence comprising a coding sequence for a 
heterologous envelope protein and a retroviral nucleotide sequence which comprises a 
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse 
transcription and integration. In a third embodiment, the packaging cell lines further 
comprise a retroviral nucleotide sequence which comprises a DNA sequence of interest 
and HIV cis-acting sequences required for packaging, reverse transcription and 
integration. Alternatively, the packaging cell lines of the present invention comprise (1) a 
retroviral nucleotide sequence which comprises a codon optimized gag coding sequence 
and (2) a retroviral nucleotide sequence which comprises a codon optimized pol coding 
sequence, in place of the retroviral nucleotide sequence which comprises a codon 
optimized gagpol coding sequence. 

Codon optimization of the coding sequence(s) for lentivirus gagpol results in 
improved expression of the lentivirus gagpol proteins and reduces 
the risk of recombination between the transfer vector and gagpol messenger RNA. 
Codon optimization of the coding sequence(s) for lentivirus gagpol was obtained by 
mutagenizing, for each particular amino acid residue, specific nucleic acid bases in a 
codon for the particular amino acid residue to a nucleic acid base which is present in a 
codon which occurs at a high frequency in genes which are highly expressed for the 
same amino acid residue. In a particular embodiment, the resulting optimized codon 
also does not cause introduction of mRNA splicing signals into the codon optimized 
sequence. Thus, in a particular embodiment, codon optimization of the coding 
sequence(s) for lentivirus gagpol is obtained by mutagenizing, for each particular amino 
acid residue, specific nucleic acid bases in a codon for the particular amino acid residue 
to a nucleic acid base that is present in a codon which (1) occurs at a high frequency in 
genes which are highly expressed for the same amino acid residue and (2) does not 
cause introduction of mRNA splicing signals into the codon optimized sequence. 
Codon optimization typically results in the removal of A-rich 
instability elements. 
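
The codon replacement rule described above can be sketched in a few lines. The following Python is a minimal illustration under stated assumptions, not the procedure actually used: the function names, the `protected` parameter (0-based, half-open nucleotide ranges left untouched, standing in for a region such as the gag-pol overlap), the toy sequence and the usage values in `TOY_USAGE` are all hypothetical placeholders and are not the frequencies of Figure 2. The sketch also does not screen for mRNA splicing signals.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate complete codons of dna, reading from the first base."""
    return "".join(GENETIC_CODE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def preferred_codons(usage):
    """Map each amino acid to its most frequent codon in the usage table."""
    best = {}
    for codon, freq in usage.items():
        aa = GENETIC_CODE[codon]
        if aa not in best or freq > usage[best[aa]]:
            best[aa] = codon
    return best


def codon_optimize(dna, usage, protected=()):
    """Swap each codon for the preferred synonymous codon.

    Codons touching any (start, end) range in protected are kept as-is.
    """
    best = preferred_codons(usage)
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        touches_protected = any(s < i + 3 and i < e for s, e in protected)
        if len(codon) < 3 or touches_protected:
            out.append(codon)
        else:
            out.append(best.get(GENETIC_CODE[codon], codon))
    return "".join(out)


# Placeholder usage values for illustration only (not the Figure 2 data).
TOY_USAGE = {"ATG": 1.0, "GGT": 0.1, "GGC": 0.6, "AAA": 0.2,
             "AAG": 0.8, "GCA": 0.2, "GCC": 0.5}
wild = "ATGGGTAAAGCA"                       # toy sequence encoding M-G-K-A
optimized = codon_optimize(wild, TOY_USAGE)
partly = codon_optimize(wild, TOY_USAGE, protected=[(6, 9)])
```

In this toy case every replaced codon is synonymous, so the encoded amino acid sequence is unchanged, and the codon inside the protected range is left as in the input.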

In a particular embodiment, the coding sequence for an HIV gagpol (pNL4-3; 
available through the AIDS repository, NIH; Adachi et al., J. Virol., 59:284-291 (1986)) 
has been codon optimized to improve translational efficiency of the HIV gagpol 
proteins and reduce the risk of recombination between the transfer vector and HIV 
gagpol messenger RNA. Two hundred thirty-seven base pairs (237 bp) consisting of the 
gag-pol overlap and cis-acting signals necessary for translation of pol (nucleotides 2583 
to 2819 of SEQ ID NO: 12) were not optimized. The HIV gagpol sequence obtained 
using the codon optimization process does not differ at the amino acid level from the 
wildtype HIV gagpol sequence, but differs at the nucleotide level from the wildtype HIV 
gagpol sequence. A codon optimized HIV gag sequence is shown in Figures 8A-8E 
(pHDMHgpm2.seq) (SEQ ID NO: 4). A codon optimized HIV pol sequence is shown in 
Figures 9A-9L (pHDMHgpm2.seq) (SEQ ID NO: 10). 
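
The property stated above, identity at the amino acid level together with differences at the nucleotide level, can be checked mechanically for any pair of aligned coding sequences. The Python below is an illustrative sketch; the function name `compare_coding_sequences` and the two short toy sequences are assumptions introduced here and are not taken from SEQ ID NO: 2 to 11.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def compare_coding_sequences(wildtype, optimized):
    """Return (same_protein, nucleotide_differences, codons_changed).

    Both inputs must be in-frame coding sequences of equal length.
    """
    if len(wildtype) != len(optimized) or len(wildtype) % 3:
        raise ValueError("sequences must be equal length and a multiple of 3")
    same_protein = True
    codons_changed = 0
    for i in range(0, len(wildtype), 3):
        a, b = wildtype[i:i + 3], optimized[i:i + 3]
        if GENETIC_CODE[a] != GENETIC_CODE[b]:
            same_protein = False
        if a != b:
            codons_changed += 1
    differences = sum(1 for x, y in zip(wildtype, optimized) if x != y)
    return same_protein, differences, codons_changed


# Toy sequences (hypothetical); both encode M-G-K-A.
result = compare_coding_sequences("ATGGGTAAAGCA", "ATGGGCAAGGCC")
```

For the toy pair the result reports an identical protein with three nucleotide differences spread over three codons.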

A plasmid comprising DNA sequences which encode codon optimized lentivirus 
gagpol proteins is also referred to herein as a packaging construct. This plasmid 
includes a promoter which drives the expression of the gagpol proteins, such as the 
human cytomegalovirus (hCMV) immediate early promoter. This plasmid is defective 
for the production of the viral envelope and accessory proteins tat, vif, vpr, vpu, nef and 
rev and the Rev response element (RRE). The packaging construct also does not 
contain viral sequences which are transcribed into mRNA, such as constitutive transport 
elements (CTEs). 

A packaging construct comprising a codon optimized HIV gagpol is depicted in 
Figure 1 and in Figure 11. Figures 10A-10D depict the DNA sequence (SEQ ID 
NO: 12) for the packaging construct pHDMHgpm2. This packaging construct 
(pHDMHgpm2) was constructed as follows: Plasmid pMDA.HIVgp mam was 
generated by chemical synthesis and PCR assembly (which is described in, for example, 
Stemmer et al., Gene, 164:49-53 (1995)) of 215 different oligonucleotides. The DNA 
sequence for pMDA.HIVgp mam is the same as the DNA sequence for 
pMDA.HIVgp jtg except for 4.3 kb which was codon optimized using the DNAStar 
program (LaserGene, Madison, WI). Two hundred thirty-seven base pairs (237 bp) 
consisting of the gag-pol overlap and cis-acting signals necessary for translation of pol 
(nucleotides 2583 to 2819 of SEQ ID NO: 12) were not optimized due to dual reading 
frame constraints. An NsiI site 5' of IN was preserved to aid fusion with wildtype 
sequences. Several single or double base pair silent mutations were introduced either to 
prevent potential splice donors and acceptors, or by the synthesis process. 
pMDA.HIVgp jtg was derived from HIV-1 strain NL4-3. The protease mutation that is 
present in the NL4-3 NIH GenBank sequence was then repaired (Figure 9B), changing 
the nucleotide present at position 2948 of SEQ ID NO: 12 from a "G" to a "C", thereby 
producing the codon present at nucleotide positions 2948 to 2950 of SEQ ID NO: 12 
which encodes an arginine instead of the glycine present in the NL4-3 GenBank amino 
acid sequence. The resulting plasmid was named pMDHgpmam. The EcoRI-HindIII 
fragment of pMDHgpmam was inserted into pHDM2b, a high copy version of the pMD 
vector (Ory, D. et al., Proc. Natl. Acad. Sci. USA, 93(21):11400-11406 (1996)), to 
produce plasmid pHDMHgpm. The sequencing mutation that is present in the RNase 
domain of the NL4-3 NIH GenBank sequence was repaired (Figure 9H), changing the 
codon present at nucleotide positions 4724 to 4726 of SEQ ID NO: 12 from "GGG" to 
"AAG", thereby producing a codon encoding a lysine instead of the glycine present in 
the NL4-3 GenBank amino acid sequence. The resulting plasmid was named 
pHDMHgpm2. Codon usage frequencies in the codon optimized gagpol open reading 
frame of the packaging construct pHDMHgpm2 are shown in Figure 2. 
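
The two codon repairs described above follow directly from the standard genetic code, as the short check below illustrates. This is a hedged Python sketch: the helper name `substitute` is an assumption, and because the text does not state which glycine codon was present at positions 2948 to 2950 before repair, all four glycine codons are tested.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def substitute(codon, position, base):
    """Replace the base at a 0-based position within a codon."""
    return codon[:position] + base + codon[position + 1:]


# Protease repair: first base of a glycine codon changed from G to C.
glycine_codons = sorted(c for c, aa in GENETIC_CODE.items() if aa == "G")
after_repair = {c: GENETIC_CODE[substitute(c, 0, "C")] for c in glycine_codons}

# RNase domain repair: codon changed from GGG to AAG.
before, after = GENETIC_CODE["GGG"], GENETIC_CODE["AAG"]
```

Whichever glycine codon is assumed, a G to C change at its first position yields an arginine codon, and GGG to AAG converts glycine to lysine, consistent with the repairs described.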

As used herein, a heterologous envelope protein permits pseudotyping of 
particles generated by the packaging construct and includes the G glycoprotein of 
vesicular stomatitis virus (VSV G) and the amphotropic envelope of the Moloney 
leukemia virus (MLV). A plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein is also referred to herein as an envelope coding plasmid. 

The terms "mammal" and "mammalian", as used herein, refer to any vertebrate 
animal, including monotremes, marsupials and placentals, that suckle their young and 
either give birth to living young (eutherian or placental mammals) or are egg-laying 
(metatherian or nonplacental mammals). Examples of mammalian species include 
humans and other primates (e.g., monkeys, chimpanzees), rodents (e.g., rats, mice, 
guinea pigs) and ruminants (e.g., cows, pigs, horses). 

Examples of mammalian cells include human (such as HeLa cells, 293T cells, 
NIH 3T3 cells), bovine, ovine, porcine, murine (such as embryonic stem cells), rabbit 
and monkey (such as COS1 cells) cells. The cell may be a non-dividing cell (including 
hepatocytes, myofibers, hematopoietic stem cells, neurons) or a dividing cell. The cell 
may be an embryonic cell, bone marrow stem cell or other progenitor cell. Where the 
cell is a somatic cell, the cell can be, for example, an epithelial cell, fibroblast, smooth 
muscle cell, blood cell (including a hematopoietic cell, red blood cell, T-cell, B-cell, 
etc.), tumor cell, cardiac muscle cell, macrophage, dendritic cell, neuronal cell (e.g., a 
glial cell or astrocyte), or pathogen-infected cell (e.g., those infected by bacteria, 
viruses, virusoids, parasites, or prions). 

Typically, cells isolated from a specific tissue (such as epithelium, fibroblast or 
hematopoietic cells) are categorized as a "cell-type." The cells can be obtained 
commercially or from a depository or obtained directly from an animal, such as by 
biopsy. Alternatively, the cell need not be isolated at all from the animal where, for 
example, it is desirable to deliver the virus to the animal in gene therapy. 

To produce the cell lines of the present invention for producing viral accessory 
protein independent lentivirus-derived retroviral vector particles, mammalian host cells 
are co-transfected with (a) a first plasmid comprising a DNA sequence which encodes 
lentivirus gagpol proteins, wherein said DNA sequence has been codon optimized by 
mutagenesis, as described above, to improve expression of the lentivirus gagpol 
proteins; and (b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein, or a retroviral nucleotide sequence which comprises a 
DNA sequence of interest and lentivirus cis-acting sequences required for packaging, 
reverse transcription and integration, or both, under conditions appropriate for 
transfection of the cells. 

In a particular embodiment, to produce the cell lines of the present invention for 
producing viral accessory protein independent HIV-derived retroviral vector particles, 
mammalian host cells are co-transfected with (a) a first plasmid comprising a DNA 
sequence which encodes HIV gagpol proteins, wherein said DNA sequence has been 
codon optimized by mutagenesis, as described above, to improve expression of the HIV 
gagpol proteins; and (b) a second plasmid comprising a DNA sequence which encodes a 
heterologous envelope protein, or a retroviral nucleotide sequence which comprises a 
DNA sequence of interest and HIV cis-acting sequences required for packaging, reverse 
transcription and integration, or both, under conditions appropriate for transfection of 
the cells. 

Virus stocks consisting of viral accessory protein independent lentivirus-derived,
particularly HIV-derived, retroviral vector particles of the present invention are
produced by maintaining the transfected cells under conditions suitable for virus
production (e.g., in an appropriate growth medium and for an appropriate period of time).
Such conditions, which are not critical to the invention, are generally known in the art.
See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Second Edition,
Cold Spring Harbor University Press, New York (1989); Ausubel et al., Current
Protocols in Molecular Biology, John Wiley & Sons, New York (1998); U.S. Patent
No. 5,449,614; and U.S. Patent No. 5,460,959, the teachings of which are incorporated
herein by reference.

To generate viral accessory protein independent lentivirus-derived retroviral
vector particles, mammalian host cells can be co-transfected with (a) a first plasmid
comprising a DNA sequence which encodes lentivirus gagpol proteins, wherein said DNA
sequence has been codon optimized by mutagenesis, as described above, to improve
expression of the lentivirus gagpol proteins; (b) a second plasmid comprising a DNA
sequence which encodes a heterologous envelope protein; and (c) a third plasmid
comprising a DNA sequence of interest and lentivirus cis-acting sequences required for
packaging, reverse transcription and integration. Alternatively, mammalian host cells are
transfected with a plasmid comprising a codon optimized DNA sequence encoding a
lentivirus gag protein and a plasmid comprising a codon optimized DNA sequence
encoding a lentivirus pol protein, in place of the first plasmid comprising a codon
optimized DNA sequence encoding both lentivirus gagpol proteins.
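
The co-transfection alternatives described above (a single gagpol plasmid, or separate gag and pol plasmids, together with an envelope plasmid and a transfer vector) can be summarized as a simple completeness check. The Python sketch below is illustrative only: the class `Plasmid`, the function `missing_functions` and all labels are hypothetical names introduced for this example and are not part of the disclosure.

```python
from dataclasses import dataclass

# Functions that must be supplied in trans or in cis to obtain vector particles,
# following the three-plasmid description in the text.
REQUIRED_FUNCTIONS = {"gagpol", "envelope", "transfer_vector"}


@dataclass(frozen=True)
class Plasmid:
    name: str            # hypothetical label for the plasmid
    provides: frozenset  # functions encoded by the plasmid


def missing_functions(plasmids):
    """Return the required functions that no plasmid in the set supplies."""
    supplied = set()
    for plasmid in plasmids:
        supplied |= plasmid.provides
    # Separate gag and pol plasmids may stand in for a single gagpol plasmid.
    if {"gag", "pol"} <= supplied:
        supplied.add("gagpol")
    return REQUIRED_FUNCTIONS - supplied


three_plasmid = [
    Plasmid("packaging", frozenset({"gagpol"})),
    Plasmid("envelope", frozenset({"envelope"})),
    Plasmid("transfer", frozenset({"transfer_vector"})),
]
split_gag_pol = [
    Plasmid("gag", frozenset({"gag"})),
    Plasmid("pol", frozenset({"pol"})),
    Plasmid("envelope", frozenset({"envelope"})),
    Plasmid("transfer", frozenset({"transfer_vector"})),
]

print(missing_functions(three_plasmid))
print(missing_functions(split_gag_pol))
print(missing_functions(three_plasmid[:1] + three_plasmid[2:]))
```

The check treats the gag plus pol pair as equivalent to the gagpol plasmid, mirroring the alternative embodiment in the text.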

In a particular embodiment, the invention relates to methods of producing viral
accessory protein independent HIV-derived retroviral vector particles, comprising
co-transfecting mammalian host cells with (a) a first plasmid comprising a DNA sequence
which encodes HIV gagpol proteins, wherein said DNA sequence has been codon
optimized by mutagenesis, as described above, to improve expression of the HIV gagpol
proteins; (b) a second plasmid containing a DNA sequence which encodes a
heterologous envelope protein; and (c) a third plasmid comprising a DNA sequence of
interest and HIV cis-acting sequences required for packaging, reverse transcription and
integration. Alternatively, mammalian host cells are transfected with a plasmid
comprising a codon optimized DNA sequence encoding an HIV gag protein and a
plasmid comprising a codon optimized DNA sequence encoding an HIV pol protein, in
place of the first plasmid comprising a codon optimized DNA sequence encoding both
HIV gagpol proteins.

Virus particles produced by the methods described herein, using a codon
optimized HIV packaging construct produced as described herein, were compared by
Western analysis with virus particles produced as described in Naldini et al., Science,
272:263-267 (1996), using the packaging construct plasmid pCMVΔR8.2. Both the
immunological reactivity and the proteolytic processing were confirmed to be
indistinguishable.

A plasmid comprising a DNA sequence of interest and HIV cis-acting sequences
required for packaging, reverse transcription and integration is also referred to herein as
a transfer vector. A transfer vector, as used herein, refers to a vehicle which is used to
introduce a DNA of interest into a eukaryotic cell, particularly a mammalian cell.
Figure 3 depicts an example of a transfer vector.

A DNA sequence of interest, as used herein, includes all or a portion of a gene or
genes encoding a nucleic acid product whose expression in a cell or a mammal is
desired. In a particular embodiment, the nucleic acid product is a heterologous
therapeutic protein. Examples of therapeutic proteins include antigens or immunogens,
such as a polyvalent vaccine, cytokines, tumor necrosis factor, interferons, interleukins,
adenosine deaminase, insulin, T-cell receptors, soluble CD4, growth factors (such as
epidermal growth factor, human growth factor, insulin-like growth factors, fibroblast
growth factors), blood factors, such as Factor VIII, Factor IX, cytochrome b,
glucocerebrosidase, ApoE, ApoC, ApoAI, the LDL receptor, negative selection markers
or "suicide proteins", such as thymidine kinase (including the HSV, CMV, VZV TK),
anti-angiogenic factors, Fc receptors, plasminogen activators, such as t-PA, u-PA and
streptokinase, dopamine, MHC, tumor suppressor genes such as p53 and Rb,
monoclonal antibodies or antigen binding fragments thereof, drug resistance genes, ion
channels, such as a calcium channel or a potassium channel, adrenergic receptors,
hormones (including growth hormones) and anti-cancer agents. In another embodiment,
the nucleic acid product is a gene product to be expressed in a cell or a mammal and
which product is otherwise defective or absent in the cell or mammal. For example, the
nucleic acid product can be a functional gene(s) which is defective or absent in the cell
or mammal.

A DNA sequence of interest includes DNA sequences (control sequences) which
are necessary to drive the expression of the gene or genes. The control sequences are
operably linked to the gene. The term "operably linked", as used herein, is defined to
mean that the gene is linked to control sequences in a manner which allows expression
of the gene (or the nucleic acid sequence). Generally, operably linked means
contiguous.

Control sequences include a transcriptional promoter, an optional operator
sequence to control transcription, a sequence encoding suitable mRNA ribosomal
binding sites and sequences which control termination of transcription and translation.
In a particular embodiment, a recombinant gene encoding a desired nucleic acid product
can be placed under the regulatory control of a promoter which can be induced or
repressed, thereby offering a greater degree of control with respect to the level of the
product produced.

As used herein, the term "promoter" refers to a sequence of DNA, usually
upstream (5') of the coding region of a structural gene, which controls the expression of
the coding region by providing recognition and binding sites for RNA polymerase and
other factors which may be required for initiation of transcription. Suitable promoters
are well known in the art. Exemplary promoters include the SV40, CMV and human
elongation factor (EF1) promoters. Other suitable promoters are readily available in the
art (see, e.g., Ausubel et al., Current Protocols in Molecular Biology, John Wiley &
Sons, Inc., New York (1998); Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd edition, Cold Spring Harbor University Press, New York (1989); and U.S.
Patent No. 5,681,735).

A DNA sequence of interest can be isolated from nature, modified from native
sequences or manufactured de novo, as described in, for example, Ausubel et al.,
Current Protocols in Molecular Biology, John Wiley & Sons, New York (1998); and
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring
Harbor University Press, New York (1989). DNA sequences can be isolated and fused
together by methods known in the art, such as exploiting and manufacturing compatible
cloning or restriction sites.

The packaging cell lines and viral particles of the present invention can be used,
in vitro, in vivo and ex vivo, to introduce DNA of interest into a eukaryotic cell (e.g., a
mammalian cell) or a mammal (e.g., a human or other mammal or vertebrate). The cells
can be obtained commercially or from a depository or obtained directly from a mammal,
such as by biopsy. The cells can be obtained from a mammal to whom they will be
returned or from another/different mammal of the same or different species. For
example, using the packaging cell lines or viral particles of the present invention, DNA
of interest can be introduced into nonhuman cells, such as pig cells, which are then
introduced into a human. Alternatively, the cell need not be isolated from the mammal
where, for example, it is desirable to deliver viral particles of the present invention to the
mammal in gene therapy.

Ex vivo therapy has been described, for example, in Kasid et al., Proc. Natl.
Acad. Sci. USA, 87:473 (1990); Rosenberg et al., N. Engl. J. Med., 323:570 (1990);
Williams et al., Nature, 310:476 (1984); Dick et al., Cell, 42:71 (1985); Keller et al.,
Nature, 318:149 (1985); and Anderson et al., United States Patent No. 5,399,346.

Methods for administering (introducing) viral particles directly to a mammal are
generally known to those skilled in the art. For example, modes of administration
include parenteral, injection, mucosal, systemic, implant, intraperitoneal, oral,
intradermal, transdermal (e.g., in slow release polymers), intramuscular, intravenous
including infusion and/or bolus injection, subcutaneous, topical, epidural, etc. Viral
particles of the present invention can, preferably, be administered in a pharmaceutically
acceptable carrier, such as saline, sterile water, Ringer's solution, and isotonic sodium
chloride solution.

The dosage of a viral particle of the present invention administered to a
mammal, including frequency of administration, will vary depending upon a variety of
factors, including mode and route of administration; size, age, sex, health, body weight
and diet of the recipient mammal; nature and extent of symptoms of the disease or
disorder being treated; kind of concurrent treatment, frequency of treatment, and the
effect desired.

The teachings of all the articles, patents, patent applications and GenBank 
sequences cited herein are incorporated by reference in their entirety. 

While this invention has been particularly shown and described with references
to preferred embodiments thereof, it will be understood by those skilled in the art that
various changes in form and details may be made therein without departing from the
spirit and scope of the invention as defined by the appended claims.

CLAIMS



What is claimed is: 



1. A packaging cell line for producing a viral accessory protein independent
HIV-derived retroviral vector particle comprising:
a) a mammalian cell;

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for a HIV gagpol, wherein said coding sequence has
been mutagenized to improve expression of the HIV gagpol proteins;

c) a second retroviral nucleotide sequence in the cell which comprises the
coding sequence for a heterologous envelope protein; and

d) a third retroviral nucleotide sequence in the cell which comprises a DNA
sequence of interest and HIV cis-acting sequences required for
packaging, reverse transcription and integration.



2. A packaging cell line of Claim 1 wherein the heterologous envelope protein is
the G glycoprotein of vesicular stomatitis virus (VSV G).



3. A packaging cell line of Claim 1 wherein the heterologous envelope protein is
the amphotropic envelope of the Moloney leukemia virus.



4. A packaging cell line of Claim 1 wherein the DNA sequence of interest encodes 
a heterologous therapeutic protein. 



5. A packaging cell line comprising:
a) a mammalian cell;

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for a HIV gagpol, wherein said coding sequence has
been mutagenized to improve expression of the HIV gagpol proteins; and

c) a second retroviral nucleotide sequence in the cell which comprises a
DNA sequence of interest and HIV cis-acting sequences required for
packaging, reverse transcription and integration.

6. A packaging cell line of Claim 5 wherein the DNA sequence of interest encodes
a heterologous therapeutic protein.

7. A packaging cell line comprising:

a) a mammalian cell;

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for a HIV gagpol, wherein said coding sequence has
been mutagenized to improve expression of the HIV gagpol proteins; and

c) a second retroviral nucleotide sequence in the cell which comprises the
coding sequence for a heterologous envelope protein.

8. A method of producing a packaging cell line for producing a viral accessory
protein independent HIV-derived retroviral vector particle, comprising
co-transfecting mammalian host cells with:

a) a first plasmid comprising a DNA sequence which encodes HIV gagpol
proteins, wherein said DNA sequence has been mutagenized to improve
expression of the HIV gag and pol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and HIV
cis-acting sequences required for packaging, reverse transcription and
integration.

9. A method of Claim 8 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).

10. A method of Claim 8 wherein the heterologous envelope protein is the 
amphotropic envelope of the Moloney leukemia virus. 

11. A method of Claim 8 wherein the DNA sequence of interest is a heterologous 
therapeutic protein. 

12. A method of producing a viral accessory protein independent HIV-derived
retroviral vector particle comprising co-transfecting mammalian host cells with:
a) a first plasmid comprising a DNA sequence which encodes HIV gagpol
proteins, wherein said DNA sequence has been mutagenized to improve
expression of the HIV gagpol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and HIV
cis-acting sequences required for packaging, reverse transcription and
integration.

13. A method of Claim 12 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).

14. A method of Claim 12 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus.

15. A method of Claim 12 wherein the DNA sequence of interest encodes a
heterologous therapeutic protein.

16. A packaging cell line for producing a viral accessory protein independent
lentivirus-derived retroviral vector particle comprising:

a) a mammalian cell;

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for a lentivirus gagpol, wherein said coding sequence
has been mutagenized to improve expression of the lentivirus gagpol
proteins;

c) a second retroviral nucleotide sequence in the cell which comprises the
coding sequence for a heterologous envelope protein; and

d) a third retroviral nucleotide sequence in the cell which comprises a DNA
sequence of interest and lentivirus cis-acting sequences required for
packaging, reverse transcription and integration.

17. A packaging cell line of Claim 16 wherein the heterologous envelope protein is
the G glycoprotein of vesicular stomatitis virus (VSV G).

18. A packaging cell line of Claim 16 wherein the heterologous envelope protein is
the amphotropic envelope of the Moloney leukemia virus.

19. A packaging cell line of Claim 16 wherein the DNA sequence of interest
encodes a heterologous therapeutic protein.

20. A packaging cell line comprising:

a) a mammalian cell;

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for lentivirus gagpol, wherein said coding sequence has
been mutagenized to improve expression of the lentivirus gagpol
proteins; and

c) a second retroviral nucleotide sequence in the cell which comprises a
DNA sequence of interest and lentivirus cis-acting sequences required
for packaging, reverse transcription and integration.

21. A packaging cell line of Claim 20 wherein the DNA sequence of interest
encodes a heterologous therapeutic protein.

22. A packaging cell line comprising:

a) a mammalian cell;

b) a first retroviral nucleotide sequence in the cell which comprises a
coding sequence for lentivirus gagpol, wherein said coding sequence has
been mutagenized to improve expression of the lentivirus gagpol
proteins; and

c) a second retroviral nucleotide sequence in the cell which comprises the
coding sequence for a heterologous envelope protein.

23. A method of producing a packaging cell line for producing a viral accessory
protein independent lentivirus-derived retroviral vector particle, comprising
co-transfecting mammalian host cells with:

a) a first plasmid comprising a DNA sequence which encodes lentivirus
gagpol proteins, wherein said DNA sequence has been mutagenized to
improve expression of the lentivirus gag and pol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and lentivirus
cis-acting sequences required for packaging, reverse transcription and
integration.

24. A method of Claim 23 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).

25. A method of Claim 23 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus.

26. A method of Claim 23 wherein the DNA sequence of interest is a heterologous
therapeutic protein.

27. A method of producing a viral accessory protein independent lentivirus-derived
retroviral vector particle comprising co-transfecting mammalian host cells with:

a) a first plasmid comprising a DNA sequence which encodes lentivirus
gagpol proteins, wherein said DNA sequence has been mutagenized to
improve expression of the lentivirus gagpol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and lentivirus
cis-acting sequences required for packaging, reverse transcription and
integration.

28. A method of Claim 27 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).

29. A method of Claim 27 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus.

30. A method of Claim 27 wherein the DNA sequence of interest encodes a 
heterologous therapeutic protein. 

31. A viral accessory protein independent HIV-derived retroviral vector particle
produced by the method comprising co-transfecting mammalian host cells with:
a) a first plasmid comprising a DNA sequence which encodes HIV gagpol
proteins, wherein said DNA sequence has been mutagenized to improve
expression of the HIV gagpol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and HIV
cis-acting sequences required for packaging, reverse transcription and
integration.

32. A method of Claim 31 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).

33. A method of Claim 31 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus.



34. A method of Claim 31 wherein the DNA sequence of interest encodes a
heterologous therapeutic protein.

35. A viral accessory protein independent lentivirus-derived retroviral vector
particle produced by the method comprising co-transfecting mammalian host
cells with:

a) a first plasmid comprising a DNA sequence which encodes lentivirus
gagpol proteins, wherein said DNA sequence has been mutagenized to
improve expression of the lentivirus gagpol proteins;

b) a second plasmid comprising a DNA sequence which encodes a
heterologous envelope protein; and

c) a third plasmid comprising a DNA sequence of interest and lentivirus
cis-acting sequences required for packaging, reverse transcription and
integration.

36. A method of Claim 35 wherein the heterologous envelope protein is the G
glycoprotein of vesicular stomatitis virus (VSV G).

37. A method of Claim 35 wherein the heterologous envelope protein is the
amphotropic envelope of the Moloney leukemia virus.

38. A method of Claim 35 wherein the DNA sequence of interest encodes a
heterologous therapeutic protein.

39. Isolated DNA encoding a codon optimized HIV gagpol.

40. Isolated DNA encoding a codon optimized HIV gag.

41. Isolated DNA of Claim 40 comprising the nucleotide sequence of SEQ ID NO:4.

42. Isolated DNA encoding a codon optimized HIV pol.

43. Isolated DNA of Claim 42 comprising the nucleotide sequence of SEQ ID
NO:10.

44. A method of introducing a DNA sequence of interest into a mammal comprising
introducing into said mammal a viral accessory protein independent
HIV-derived retroviral vector particle comprising the DNA sequence of interest.

45. The method of Claim 44 wherein the mammal is a human.

46. The method of Claim 44 wherein the DNA sequence of interest encodes a
heterologous therapeutic protein.

47. A method of introducing a DNA sequence of interest into a mammal comprising
the steps of:

a) introducing into cells a viral accessory protein independent HIV-derived
retroviral vector particle comprising the DNA sequence of interest; and

b) returning the cells obtained in step a) to the mammal.

48. The method of Claim 47 wherein the mammal is a human.

49. The method of Claim 47 wherein the DNA sequence of interest is a heterologous
therapeutic protein.

PACKAGING CELL LINES

ABSTRACT OF THE DISCLOSURE

Novel packaging cell lines useful for generating viral accessory protein
independent HIV-derived retroviral vector particles, methods of constructing such
packaging cell lines and methods of using the viral accessory protein independent
HIV-derived retroviral vector particles are disclosed.

Regulates HIV gene expression by promoting
cytoplasmic levels of unspliced and singly spliced
mRNAs

Postulated to affect splicing, stability, transport,
and translation



Fig. 4



Codon Optimization of HIV gagpol 

• Remove A-rich instability elements 

• Improve translational efficiency 

• Reduce risk of recombination with 
transfer vector 



Fig. 5



Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table.

792 MGARASVLSGGELDK NL4-3 genbank.SEQ
792 ATG GGT GCG AGA GCG TCG GTA TTA AGC GGG GGA GAA TTA GAT AAA
1319 MGARASVLSGGELDK pHDMHgpm2.seq
1319 ATG GGC GCC CGC GCC TCC GTG CTG TCC GGC GGC GAG CTG GAC AAG

837 WEKIRLRPGGKKQYK NL4-3 genbank.SEQ
837 TGG GAA AAA ATT CGG TTA AGG CCA GGG GGA AAG AAA CAA TAT AAA
1364 WEKIRLRPGGKKQYK pHDMHgpm2.seq
1364 TGG GAG AAG ATC CGC CTG CGC CCC GGC GGC AAG AAG CAG TAC AAG

882 LKHIVWASRELERFA NL4-3 genbank.SEQ
882 CTA AAA CAT ATA GTA TGG GCA AGC AGG GAG CTA GAA CGA TTC GCA
1409 LKHIVWASRELERFA pHDMHgpm2.seq
1409 CTG AAG CAC ATC GTG TGG GCC TCC CGC GAG CTG GAG CGC TTC GCC

927 VNPGLLETSEGCRQI NL4-3 genbank.SEQ
927 GTT AAT CCT GGC CTT TTA GAG ACA TCA GAA GGC TGT AGA CAA ATA
1454 VNPGLLETSEGCRQI pHDMHgpm2.seq
1454 GTG AAC CCC GGC CTG CTG GAG ACC TCC GAG GGC TGC CGC CAG ATC

972 LGQLQPSLQTGSEEL NL4-3 genbank.SEQ
972 CTG GGA CAG CTA CAA CCA TCC CTT CAG ACA GGA TCA GAA GAA CTT
1499 LGQLQPSLQTGSEEL pHDMHgpm2.seq
1499 CTG GGC CAG CTG CAG CCC TCC CTG CAA ACC GGC TCC GAG GAG CTG

1017 RSLYNTIAVLYCVHQ NL4-3 genbank.SEQ
1017 AGA TCA TTA TAT AAT ACA ATA GCA GTC CTC TAT TGT GTG CAT CAA
1544 RSLYNTIAVLYCVHQ pHDMHgpm2.seq
1544 [nucleotide line not legible]

1062 RIDVKDTKEALDKIE NL4-3 genbank.SEQ
1062 AGG ATA GAT GTA AAA GAC ACC AAG GAA GCC TTA GAT AAG ATA GAG
1589 RIDVKDTKEALDKIE pHDMHgpm2.seq
1589 CGC ATC GAC GTG AAG GAC ACC AAG GAG GCC CTG GAC AAG ATC GAG

1107 EEQNKSKKKAQQAAA NL4-3 genbank.SEQ
1107 GAA GAG CAA AAC AAA AGT AAG AAA AAG GCA CAG CAA GCA GCA GCT
1634 EEQNKSKKKAQQAAA pHDMHgpm2.seq
1634 GAG GAG CAG AAC AAG TCC AAG AAG AAG GCC CAG CAG GCC GCC GCC



Fig. 8A
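
The alignment in Fig. 8A can be checked computationally. The Python sketch below, offered as an illustration and not as part of the disclosure, translates the second block of both sequences with the standard genetic code, confirms that the encoded peptide is the same, and counts adenine residues and changed codons. The helper names `translate` and `adenine_count` are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codons):
    """Translate a list of codons into a one-letter amino acid string."""
    return "".join(CODON_TABLE[c] for c in codons)


def adenine_count(codons):
    """Count adenine residues in a list of codons."""
    return sum(c.count("A") for c in codons)


# Second block of Fig. 8A (positions 837 and 1364, respectively).
native = "TGG GAA AAA ATT CGG TTA AGG CCA GGG GGA AAG AAA CAA TAT AAA".split()
recoded = "TGG GAG AAG ATC CGC CTG CGC CCC GGC GGC AAG AAG CAG TAC AAG".split()

changed = sum(1 for a, b in zip(native, recoded) if a != b)

print(translate(native), translate(recoded))
print(adenine_count(native), adenine_count(recoded), changed)
```

For this block the recoded sequence encodes the same peptide with fewer adenine residues, consistent with the stated aim of removing A-rich elements while leaving the protein unchanged.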



Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table.

1152 DTGNNSQVSQNYPIV NL4-3 genbank.SEQ
1152 GAC ACA GGA AAC AAC AGC CAG GTC AGC CAA AAT TAC CCT ATA GTG
1679 DTGNNSQVSQNYPIV pHDMHgpm2.seq
1679 GAC ACC GGC AAC AAC TCC CAG GTG TCC CAG AAC TAC CCC ATC GTG

1197 QNLQGQMVHQAISPR NL4-3 genbank.SEQ
1197 CAG AAC CTC CAG GGG CAA ATG GTA CAT CAG GCC ATA TCA CCT AGA
1724 QNLQGQMVHQAISPR pHDMHgpm2.seq
1724 CAG AAC CTG CAG GGC CAG ATG GTG CAC CAG GCC ATC TCC CCC CGC

1242 TLNAWVKVVEEKAFS NL4-3 genbank.SEQ
1242 ACT TTA AAT GCA TGG GTA AAA GTA GTA GAA GAG AAG GCT TTC AGC
1769 TLNAWVKVVEEKAFS pHDMHgpm2.seq
1769 ACC CTG AAC GCC TGG GTG AAG GTG GTG GAG GAG AAG GCC TTC TCC

1287 PEVIPMFSALSEGAT NL4-3 genbank.SEQ
1287 CCA GAA GTA ATA CCC ATG TTT TCA GCA TTA TCA GAA GGA GCC ACC
1814 PEVIPMFSALSEGAT pHDMHgpm2.seq
1814 CCC GAA GTC ATC CCC ATG TTC TCC GCC CTG TCC GAG GGC GCC ACC

1332 PQDLNTMLNTVGGHQ NL4-3 genbank.SEQ
1332 CCA CAA GAT TTA AAT ACC ATG CTA AAC ACA GTG GGG GGA CAT CAA
1859 PQDLNTMLNTVGGHQ pHDMHgpm2.seq
1859 CCC CAG GAC CTG AAC ACC ATG CTG AAC ACC GTG GGC GGC CAC CAG

1377 AAMQMLKETINEEAA NL4-3 genbank.SEQ
1377 GCA GCC ATG CAA ATG TTA AAA GAG ACC ATC AAT GAG GAA GCT GCA
1904 AAMQMLKETINEEAA pHDMHgpm2.seq
1904 GCC GCC ATG CAG ATG CTG AAG GAG ACC ATC AAC GAG GAG GCC GCC

1422 EWDRLHPVHAGPIAP NL4-3 genbank.SEQ
1422 GAA TGG GAT AGA TTG CAT CCA GTG CAT GCA GGG CCT ATT GCA CCA
1949 EWDRLHPVHAGPIAP pHDMHgpm2.seq
1949 GAG TGG GAC CGC CTG CAC CCC GTG CAC GCC GGC CCC ATC GCC CCC

1467 GQMREPRGSDIAGTT NL4-3 genbank.SEQ
1467 GGC CAG ATG AGA GAA CCA AGG GGA AGT GAC ATA GCA GGA ACT ACT
1994 GQMREPRGSDIAGTT pHDMHgpm2.seq
1994 GGC CAG ATG CGC GAG CCC CGC GGC TCC GAC ATC GCC GGC ACC ACC



Fig. 8B



Alignment Report of Codon optimization (gag).MEG, using Clustal method with PAM250 residue weight table.

1512 STLQEQIGWMTHNPP NL4-3 genbank.SEQ
1512 AGT ACC CTT CAG GAA CAA ATA GGA TGG ATG ACA CAT AAT CCA CCT
2039 STLQEQIGWMTHNPP pHDMHgpm2.seq
2039 TCC ACC CTG CAA GAG CAG ATC GGC TGG ATG ACC CAC AAC CCC CCC

1557 IPVGEIYKRWIILGL NL4-3 genbank.SEQ
1557 ATC CCA GTA GGA GAA ATC TAT AAA AGA TGG ATA ATC CTG GGA TTA
2084 IPVGEIYKRWIILGL pHDMHgpm2.seq
2084 ATC CCC GTG GGC GAG ATC TAC AAG CGC TGG ATC ATC CTG GGC CTG

1602 NKIVRMYSPTSILDI NL4-3 genbank.SEQ
1602 AAT AAA ATA GTA AGA ATG TAT AGC CCT ACC AGC ATT CTG GAC ATA
2129 NKIVRMYSPTSILDI pHDMHgpm2.seq
2129 AAC AAG ATC GTG CGC ATG TAC TCC CCC ACC TCC ATC CTG GAC ATC

1647 RQGPKEPFRDYVDRF NL4-3 genbank.SEQ
1647 AGA CAA GGA CCA AAG GAA CCC TTT AGA GAC TAT GTA GAC CGA TTC
2174 RQGPKEPFRDYVDRF pHDMHgpm2.seq
2174 CGC CAG GGC CCC AAG GAG CCC TTC CGC GAC TAC GTG GAC CGC TTC

1692 YKTLRAEQASQEVKN NL4-3 genbank.SEQ
1692 TAT AAA ACT CTA AGA GCC GAG CAA GCT TCA CAA GAG GTA AAA AAT
2219 YKTLRAEQASQEVKN pHDMHgpm2.seq
2219 TAC AAG ACC CTG CGC GCC GAG CAG GCC TCC CAG GAG GTA AAG AAC

1737 WMTETLLVQNANPDC NL4-3 genbank.SEQ
1737 TGG ATG ACA GAA ACC TTG TTG GTC CAA AAT GCG AAC CCA GAT TGT
2264 WMTETLLVQNANPDC pHDMHgpm2.seq
2264 TGG ATG ACC GAG ACC CTG CTG GTG CAG AAC GCC AAC CCC GAC TGC

1782 KTILKALGPGATLEE NL4-3 genbank.SEQ
1782 AAG ACT ATT TTA AAA GCA TTG GGA CCA GGA GCG ACA CTA GAA GAA
2309 KTILKALGPGATLEE pHDMHgpm2.seq
2309 AAG ACC ATC CTG AAG GCC CTG GGC CCC GGC GCC ACC CTG GAG GAG

1827 MMTACQGVGGPGHKA NL4-3 genbank.SEQ
1827 ATG ATG ACA GCA TGT CAG GGA GTG GGG GGA CCC GGC CAT AAA GCA
2354 MMTACQGVGGPGHKA pHDMHgpm2.seq
2354 ATG ATG ACC GCC TGC CAG GGC GTG GGC GGC CCC GGC CAC AAG GCC

Fig. 8C



Alignment Report of Codon optimization (gag).MEG, using Clustai method with PAM250 residue weight table. 
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1872 RVLAEAMS Q V T N P A T NL4-3 genbank.SEQ 

1872 AGA GTT TTG GCT GAA GCA ATG AGC CAA GTA ACA AAT CCA GCT ACC 

2399 RVLAEAM S QVTN PAT pHDMHgpm2 . s eq 

2399 CGC GTG CTG GCC GAG GCC ATG TCC CAA GTC ACC AAC CCC GCC ACC 
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1920 
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1917 imIQKGNFRNQRKTV NL4-3 genbank.SEQ 

1917 ATA ATG ATA CAG AAA GGC AAT TTT AGG AAC CAA AGA AAG ACT GTT 

2444 IMIQKGNFRNQRKTV pHDMHgpm2 . s eq 

2444 ATC ATG ATC CAG AAG GGC AAC TTC CGC AAC CAG CGC AAG ACC GTG 
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1962 KCFNCGKEGHIAKNC NL4-3 genbank.SEQ 

1962 AAG TGT TTC AAT TGT GGC AAA GAA GGG CAC ATA GCC AAA AAT TGC 

2489 KCFNCGKEGHIAKNC pHDMHgpm2 . s eq 

2489 AAG TGC TTC AAC TGC GGC AAG GAG GGC CAC ATC GCC AAG AAC TGC 
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2007 RAPRKKGCWKCGKEG NL4-3 genbank.SEQ 
2007 _ 

2534 RAPRKKGCWKCGKEG pHDMHgpm2 . seq 
2534 



2052 HQMKDCTERQANFLG NL4-3 genbank.SEQ 
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2579 HQMKDCTERQANFLG pHDMHgpm2 . seq 
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2097 KIWPSHKGRPGNFLQ NL4-3 genbank.SEQ 

2097 
2624 
2624 



2142 SRPEPTAPPEESFRF NL4-3 genbank.SEQ 

2142 

2669 SRPEPTAPPEESFRF pHDMHgpm2 . seq 
2669 



2187 GEETTTPSQKQEPID NL4-3 genbank.SEQ 

2187 

2714 GEETTTPSQKQEPID pHDMHgpm2 . seq 
2714 
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Alignment Report of Codon optimization (gag).MEG, using CJustal method with PAM250 residue weight tabie. 
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KELYPLASLRSLFGS 

AAG GAA CTG TAT CCT TTA GCT TCC CTC AGA TCA CTC TTT GGC AGC 

KELYPLASLRSLFGS 
AAG GAA CTG TAT CCT TTA GCT TCC CTC AGA TCA CTC TTT GGC AGC 
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2277 D P S S Q 

2277 GAC CCC TCG TCA CAA TAA 

2804 D P S S Q 

2804 GAC CCC TCG TCA CAA TAA 



NL4-3 genbank. SEQ 
pHDMHgpm2 . seq 



NL4-3 genbank. SEQ 
pHDMHgpm2 . seq 



Fig. 8E 



Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 
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2177 LQVWGRDNNS LSEAG NL4-3 genbank. SEQ 

2177 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA 

2175 LQVWGRDNNS LSEAG pNL4-3 . seq 

2175 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA 

2702 LQVWGRDNNSLSEAG pHDMHgpm2 . seq 

2702 CTT CAG GTT TGG GGA AGA GAC AAC AAC TCC CTC TCA GAA GCA GGA 
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2267 LWQRPLVTIKIGGQL NL4-3 genbank. SEQ 

2267 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATA GGG GGG CAA TTA 

2265 LWQRPLVTIKIGGQL pNL4-3.seq 

2265 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATA GGG GGG CAA TTA 

2792 LWQRPLVTIKIGGQL pHDMHgpm2.seq 

2792 CTT TGG CAG CGA CCC CTC GTC ACA ATA AAG ATC GGT GGC CAG CTG 
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Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 
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EMNLPGRWKPKMIGG 
GAA ATG AAT TTG CCA GGA AGA TGG AAA CCA AAA ATG ATA GGG GGA 

EMNLPGRWKPKMIGG 
GAA ATG AAT TTG CCA GGA AGA TGG AAA CCA AAA ATG ATA GGG GGA 

EMNLPGRWKPKMIGG 
GAG ATG AAC CTG CCC GGC CGC TGG AAG CCC AAG ATG ATC GGC GGC 
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EICGHKAIGTVLVGP 
GAA ATC TGC GGA CAT AAA GCT ATA GGT ACA GTA TTA GTA GGA CCT 

EICGHKAIGTVLVGP 
GAA ATC TGC GGA CAT AAA GCT ATA GGT ACA GTA TTA GTA GGA CCT 

EICGHKAIGTVLVGP 
GAG ATC TGC GGC CAC AAG GCC ATC GGC ACC GTG CTG GTG GGC CCC 
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TPVNI IGRNLLTQIG 
ACA CCT GTC AAC ATA ATT GGA AGA AAT CTG TTG ACT CAG ATT GGC 

TPVNI IGRNLLTQIG 
ACA CCT GTC AAC ATA ATT GGA AGA AAT CTG TTG ACT CAG ATT GGC 
TPVNI IGRNLLTQIG 
3017 ACC CCC GTG AAC ATC ATC GGC CGC AAC CTG CTG ACC CAG ATC GGC 
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CCC ATT AGT CCT ATT GAG ACT GTA CCA GTA 
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3062 TGC ACC CTG AAC TTC CCC ATC TCC CCC ATC GAG ACC GTG CCC GTG 

1 

2600 

i 

2582 KLKPGMDGPKVKQWP NL4-3 genbank.SEQ 

2582 AAA TTA AAG CCA GGA ATG GAT GGC CCA AAA GTT AAA CAA TGG CCA 

2580 KLKPGMDGPKVKQWP pNL4-3.seq 

2580 AAA TTA AAG CCA GGA ATG GAT GGC CCA AAA GTT AAA CAA TGG CCA 



3107 KLKPGMDGPKVKQWP pHDMHgpm2.seq 

3107 AAG CTG AAG CCC GGC ATG GAC GGC CCC AAA GTC AAG CAG TGG CCC 



Fig. 9B 



Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 



I i 

2630 2660 
i I 

2627 LTEEKI KALVEICTE NL4-3 genbank.SEQ 

2627 TTG ACA GAA GAA AAA ATA AAA GCA TTA GTA GAA ATT TGT ACA GAA 

2625 LTEEKI KALVEICTE pNL4-3.seq 

2625 TTG ACA GAA GAA AAA ATA AAA GCA TTA GTA GAA ATT TGT ACA GAA 

3152 LTEEKI KALVEICTE pHDMHgpm2 . seq 

3152 CTG ACC GAG GAG AAG ATC AAG GCC CTG GTG GAG ATC TGC ACC GAG 

1 

2690 

I 

2672 MEKEGKI SKIGPENP NL4-3 genbank.SEQ 

2672 ATG GAA AAG GAA GGA AAA ATT TCA AAA ATT GGG CCT GAA AAT CCA 

2670 MEKEGKISKIGPENP pNL4-3.seq 

2670 ATG GAA AAG GAA GGA AAA ATT TCA AAA ATT GGG CCT GAA AAT CCA 

3197 MEKEGKI SKIGPENP pHDMHgpm2 . seq 
3197 ATG GAG AAG GAG GGC AAG ATC TCC AAG ATC GGC CCC GAG AAC CCC 

1 1 — i 

2720 2750 

! I 

2717 YNTPVFAIKKKDSTK NL4-3 genbank.SEQ 

2717 TAC AAT ACT CCA GTA TTT GCC ATA AAG AAA AAA GAC AGT ACT AAA 

2715 YNTPVFAIKKKDSTK pNL4-3. seq 

2715 TAC AAT ACT CCA GTA TTT GCC ATA AAG AAA AAA GAC AGT ACT AAA 

3242 YNTPVFAIKKKDSTK pHDMHgpm2 . s eq 

3242 TAC AAC ACC CCC GTG TTC GCC ATC AAG AAG AAG GAC TCC ACC AAG 

1 

2780 

I 

2762 WRKLVD FRELNKRTQ NL4-3 genbank.SEQ 

2762 TGG AGA AAA TTA GTA GAT TTC AGA GAA CTT AAT AAG AGA ACT CAA 

2760 WRKLVD FRELNKRTQ pNL4-3 . seq 

2760 TGG AGA AAA TTA GTA GAT TTC AGA GAA CTT AAT AAG AGA ACT CAA 

3287 WRKLVD FRELNKRTQ pHDMHgpm2 . seq 

3287 TGG CGC AAG CTG GTG GAC TTC CGC GAG CTG AAC AAG CGC ACC CAG 

1 : i 

2810 2840 
i 1 

2807 DFWEVQLGI PHPAGL NL4-3 genbank.SEQ 

2807 GAT TTC TGG GAA GTT CAA TTA GGA ATA CCA CAT CCT GCA GGG TTA 

2805 DFWEVQLGIPHPAGL pNL4-3.seq 

2805 GAT TTC TGG GAA GTT CAA TTA GGA ATA CCA CAT CCT GCA GGG TTA 

3332 DFWEVQLGI PHPAGL pHDMHgpm2 . seq 

3332 GAC TTC TGG GAG GTG CAG CTG GGC ATC CCC CAC CCC GCC GGC CTG 

1 

2870 

I 

2852 KQKKSVTVLDVGDAY NL4-3 genbank.SEQ 

2852 AAA CAG AAA AAA TCA GTA ACA GTA CTG GAT GTG GGC GAT GCA TAT 

2850 KQKKSVTVLDVGDAY pNL4-3.seq 

2850 AAA CAG AAA AAA TCA GTA ACA GTA CTG GAT GTG GGC GAT GCA TAT 

3377 KQKKSVTVLDVGDAY pHDMHgpm2 . seq 

3377 AAG CAG AAG AAG TCC GTG ACC GTG CTG GAC GTG GGC GAC GCC TAC 



Fig. 9C 
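One way to summarize what the optimized rows change relative to the wild-type rows is to compare G+C counts, overall and at the third codon position. The sketch below does this for the 2852 (NL4-3) and 3377 (pHDMHgpm2) codon rows of the pol alignment. The helper names `gc_count` and `gc3_count` are illustrative assumptions, and the comparison is offered only as an observation about these two rows, not as a statement of the optimization method used.

```python
def gc_count(codon_row: str) -> int:
    """Count G and C bases in a whitespace-separated row of codons."""
    return sum(base in "GC" for codon in codon_row.split() for base in codon)


def gc3_count(codon_row: str) -> int:
    """Count codons whose third (wobble) base is G or C."""
    return sum(codon[2] in "GC" for codon in codon_row.split())


# Codon rows copied from the pol alignment (NL4-3 row 2852, pHDMHgpm2 row 3377).
wild_type = "AAA CAG AAA AAA TCA GTA ACA GTA CTG GAT GTG GGC GAT GCA TAT"
optimized = "AAG CAG AAG AAG TCC GTG ACC GTG CTG GAC GTG GGC GAC GCC TAC"

n_bases = 3 * len(wild_type.split())
n_codons = len(wild_type.split())

print(f"wild type: {gc_count(wild_type)}/{n_bases} G+C, "
      f"{gc3_count(wild_type)}/{n_codons} codons ending in G or C")
print(f"optimized: {gc_count(optimized)}/{n_bases} G+C, "
      f"{gc3_count(optimized)}/{n_codons} codons ending in G or C")
```

In this pair of rows every optimized codon ends in G or C, while the encoded peptide (KQKKSVTVLDVGDAY) is unchanged.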



Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 



1 1 

2900 2930 

! I 

2897 FSVPLDKDFRKYTAF NL4-3 genbank. SEQ 

2897 TTT TCA GTT CCC TTA GAT AAA GAC TTC AGG AAG TAT ACT GCA TTT 

2895 FSVPLDKDFRKYTAF pNL4-3.seq 

2895 TTT TCA GTT CCC TTA GAT AAA GAC TTC AGG AAG TAT ACT GCA TTT 

3422 FSVPLDKDFRKYTAF pHDMHgpm2 . seq 

3422 TTC TCC GTG CCC CTG GAC AAG GAC TTC CGC AAG TAC ACC GCC TTC 

1 

2960 

l 

2942 TIPSINNETPGIRYQ NL4-3 genbank.SEQ 

2942 ACC ATA CCT AGT ATA AAC AAT GAG ACA CCA GGG ATT AGA TAT CAG 

2940 TIPS INNETPGIRYQ pNL4-3.seq 

2940 ACC ATA CCT AGT ATA AAC AAT GAG ACA CCA GGG ATT AGA TAT CAG 

3467 TIPSINNETPGIRYQ pHDMHgpm2.seq 

3467 ACC ATC CCC TCC ATC AAC AAC GAG ACC CCC GGC ATC CGC TAC CAG 

1 1 

2990 3020 
i I 

2987 YNVLPQGWKGSPAIF NL4-3 genbank.SEQ 

2987 TAC AAT GTG CTT CCA CAG GGA TGG AAA GGA TCA CCA GCA ATA TTC 

2985 YNVLPQGWKGSPAIF pNL4-3.seq 

2985 TAC AAT GTG CTT CCA CAG GGA TGG AAA GGA TCA CCA GCA ATA TTC 

3512 YNVLPQGWKGSPAIF pHDMHgpm2 . seq 

3512 TAC AAC GTG CTG CCC CAG GGC TGG AAG GGC TCC CCC GCC ATC TTC 

1 

3050 

1 

3032 QCSMTKILEPFRKQN NL4-3 genbank . SEQ 

3032 CAG TGT AGC ATG ACA AAA ATC TTA GAG CCT TTT AGA AAA CAA AAT 

3030 QCSMTKILEPFRKQN pNL4-3.seq 

3030 CAG TGT AGC ATG ACA AAA ATC TTA GAG CCT TTT AGA AAA CAA AAT 

3557 QCSMTKILEPFRKQN pHDMHgpm2 . seq 

3557 CAG TGC TCC ATG ACC AAG ATC CTG GAG CCC TTC CGC AAG CAG AAC 

i 1 

3080 3110 

I i 

3077 PDIVI YQYMDDLYVG NL4-3 genbank. SEQ 

3077 CCA GAC ATA GTC ATC TAT CAA TAC ATG GAT GAT TTG TAT GTA GGA 

3075 PDIVIYQYMDDLYVG pNL4-3.seq 

3075 CCA GAC ATA GTC ATC TAT CAA TAC ATG GAT GAT TTG TAT GTA GGA 

3602 PDIVIYQYMDDLYVG pHDMHgpm2 . seq 

3602 CCC GAC ATC GTG ATC TAC CAG TAC ATG GAC GAC CTG TAC GTG GGC 

j 

3140 

I 



3122 SDLEIGQHRTKIEEL NL4-3 genbank. SEQ 

3122 TCT GAC TTA GAA ATA GGG CAG CAT AGA ACA AAA ATA GAG GAA CTG 

3120 SDLEIGQHRTKIEEL pNL4-3.seq 

3120 TCT GAC TTA GAA ATA GGG CAG CAT AGA ACA AAA ATA GAG GAA CTG 

3647 SDLEIGQHRTKIEEL pHDMHgpm2 . seq 

3647 TCC GAC CTG GAG ATC GGC CAG CAC CGC ACC AAG ATC GAG GAG CTG 



Fig. 9D 



Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 



3170 3200 

3167 RQHLLRWGFTTPDKK NL4-3 genbank.SEQ 

3167 AGA CAA CAT CTG TTG AGG TGG GGA TTT ACC ACA CCA GAC AAA AAA 

3165 RQHLLRWGFTTPDKK pNL4-3.seq 

3165 AGA CAA CAT CTG TTG AGG TGG GGA TTT ACC ACA CCA GAC AAA AAA 

3692 RQHLLRWGFTTPDKK pHDMHgpm2.seq 

3692 CGC CAG CAC CTG CTG CGC TGG GGC TTC ACC ACC CCC GAC AAG AAG 

3230 

3212 HQKEPPFLWMGYELH NL4-3 genbank.SEQ 

3212 CAT CAG AAA GAA CCT CCA TTC CTT TGG ATG GGT TAT GAA CTC CAT 

3210 HQKEPPFLWMGYELH pNL4-3.seq 

3210 CAT CAG AAA GAA CCT CCA TTC CTT TGG ATG GGT TAT GAA CTC CAT 

3737 HQKEPPFLWMGYELH pHDMHgpm2.seq 

3737 CAC CAG AAG GAG CCC CCC TTC CTG TGG ATG GGC TAC GAG CTG CAC 

3260 3290 

3257 PDKWTVQPIVLPEKD NL4-3 genbank.SEQ 

3257 CCT GAT AAA TGG ACA GTA CAG CCT ATA GTG CTG CCA GAA AAG GAC 

3255 PDKWTVQPIVLPEKD pNL4-3.seq 

3255 CCT GAT AAA TGG ACA GTA CAG CCT ATA GTG CTG CCA GAA AAG GAC 

3782 PDKWTVQPIVLPEKD pHDMHgpm2.seq 

3782 CCC GAC AAG TGG ACC GTG CAG CCC ATC GTG CTG CCC GAG AAG GAC 

3320 

3302 SWTVNDIQKLVGKLN NL4-3 genbank.SEQ 

3302 AGC TGG ACT GTC AAT GAC ATA CAG AAA TTA GTG GGA AAA TTG AAT 

3300 SWTVNDIQKLVGKLN pNL4-3.seq 

3300 AGC TGG ACT GTC AAT GAC ATA CAG AAA TTA GTG GGA AAA TTG AAT 

3827 SWTVNDIQKLVGKLN pHDMHgpm2.seq 

3827 TCC TGG ACC GTG AAC GAC ATC CAG AAG CTG GTG GGC AAG CTG AAC 

3350 3380 

3347 WASQIYAGIKVRQLC NL4-3 genbank.SEQ 

3347 TGG GCA AGT CAG ATT TAT GCA GGG ATT AAA GTA AGG CAA TTA TGT 

3345 WASQIYAGIKVRQLC pNL4-3.seq 

3345 TGG GCA AGT CAG ATT TAT GCA GGG ATT AAA GTA AGG CAA TTA TGT 

3872 WASQIYAGIKVRQLC pHDMHgpm2.seq 

3872 TGG GCC TCC CAG ATC TAC GCC GGC ATC AAA GTC CGC CAG CTG TGC 

3410 

3392 KLLRGTKALTEVVPL NL4-3 genbank.SEQ 

3392 AAA CTT CTT AGG GGA ACC AAA GCA CTA ACA GAA GTA GTA CCA CTA 

3390 KLLRGTKALTEVVPL pNL4-3.seq 

3390 AAA CTT CTT AGG GGA ACC AAA GCA CTA ACA GAA GTA GTA CCA CTA 

3917 KLLRGTKALTEVVPL pHDMHgpm2.seq 

3917 AAG CTG CTG CGC GGC ACC AAG GCC CTG ACC GAG GTG GTG CCC CTG 



Fig. 9E 



Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 



1 1 _^_= 

3440 ' 3470 
I I 

3437 TEEAELELAENREIL NL4-3 genbank.SEQ 

3437 ACA GAA GAA GCA GAG CTA GAA CTG GCA GAA AAC AGG GAG ATT CTA 

3435 TEEAELELAENREIL pNL4-3.seq 

3435 ACA GAA GAA GCA GAG CTA GAA CTG GCA GAA AAC AGG GAG ATT CTA 

3962 TEEAELELAENREIL pHDMHgpm2 .seq 

3962 ACC GAG GAG GCC GAG CTG GAG CTG GCC GAG AAC CGC GAG ATC CTG 

I — — 

3500 

1 

3482 KEPVHGVYYDPSKDL NL4-3 genbank.SEQ 

3482 AAA GAA CCG GTA CAT GGA GTG TAT TAT GAC CCA TCA AAA GAC TTA 

3480 KEPVHGVYYDPSKDL pNL4-3.seq 

3480 AAA GAA CCG GTA CAT GGA GTG TAT TAT GAC CCA TCA AAA GAC TTA 

4007 KEPVHGVYYDPSKDL pHDMHgpm2 . seq 
4007 AAG GAG CCC GTG CAC GGC GTG TAC TAC GAC CCC TCC AAG GAC CTG 

1 1 

3530 3560 
i i 

3527 IAEIQKQGQGQWTYQ NL4-3 genbank.SEQ 

3527 ATA GCA GAA ATA CAG AAG CAG GGG CAA GGC CAA TGG ACA TAT CAA 

3525 IAEIQKQGQGQWTYQ pNL4-3.seq 

3525 ATA GCA GAA ATA CAG AAG CAG GGG CAA GGC CAA TGG ACA TAT CAA 

4052 IAEIQKQGQGQWTYQ pHDMHgpm2 . seq 

4052 ATC GCC GAG ATC CAG AAG CAG GGC CAG GGC CAG TGG ACC TAC CAG 

1 _ 

3590 

1 

3572 IYQEP FKNLKTGKYA NL4-3 genbank.SEQ 

3572 ATT TAT CAA GAG CCA TTT AAA AAT CTG AAA ACA GGA AAA TAT GCA 

3570 IYQEP FKNLKTGKYA pNL4-3.seq 

3570 ATT TAT CAA GAG CCA TTT AAA AAT CTG AAA ACA GGA AAA TAT GCA 

4097 IYQEP FKNLKTGKYA pHDMHgpm2 . seq 

4097 ATC TAC CAG GAG CCC TTC AAG AAC CTG AAG ACC GGC AAA TAC GCC 



3620 3650 
I I 

3617 RMKGAHTNDVKQLTE NL4-3 genbank.SEQ 

3617 AGA ATG AAG GGT GCC CAC ACT AAT GAT GTG AAA CAA TTA ACA GAG 

3615 RMKGAHTNDVKQLTE pNL4-3.seq 

3615 AGA ATG AAG GGT GCC CAC ACT AAT GAT GTG AAA CAA TTA ACA GAG 

4142 RMKGAHTNDVKQLTE pHDMHgpm2 . s eq 

4142 CGC ATG AAG GGC GCC CAC ACC AAC GAC GTG AAG CAG CTG ACC GAG 

1 

3680 

i 

3662 AVQKIATESIVIWGK NL4-3 genbank.SEQ 

3662 GCA GTA CAA AAA ATA GCC ACA GAA AGC ATA GTA ATA TGG GGA AAG 

3660 AVQKIATESIVIWGK pNL4-3.seq 

3660 GCA GTA CAA AAA ATA GCC ACA GAA AGC ATA GTA ATA TGG GGA AAG 

4187 AVQKIATESIVIWGK pHDMHgpm2 . s eq 
4187 GCC GTG CAG AAG ATC GCC ACC GAG TCC ATC GTG ATC TGG GGC AAG 



Fig. 9F 
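The position labels on the aligned rows appear to differ by a fixed amount from one sequence to another, which is what one would expect when the same coding region sits at different coordinates in each file. A minimal sketch of that consistency check follows, using position labels copied from four consecutive row groups of the pol alignment. The tuple layout and variable names are illustrative assumptions.

```python
# (NL4-3 genbank.SEQ, pNL4-3.seq, pHDMHgpm2.seq) row-start positions,
# copied from four consecutive row groups of the pol alignment.
row_starts = [
    (2627, 2625, 3152),
    (2672, 2670, 3197),
    (2717, 2715, 3242),
    (2762, 2760, 3287),
]

# Offset of each sequence's numbering relative to pNL4-3.seq.
genbank_offsets = {genbank - pnl for genbank, pnl, _ in row_starts}
optimized_offsets = {hdm - pnl for _, pnl, hdm in row_starts}

# Each row group covers 15 codons, so consecutive labels advance by 45 bases.
steps = {b[1] - a[1] for a, b in zip(row_starts, row_starts[1:])}

print(genbank_offsets, optimized_offsets, steps)
```

A single value in each set indicates that the three numbering schemes stay in register across these rows, so a garbled position label elsewhere in the figures can be cross-checked against its neighbors.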



Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 





3710 3740 

3707 TPKFKLPIQKETWEA NL4-3 genbank.SEQ 

3707 ACT CCT AAA TTT AAA TTA CCC ATA CAA AAG GAA ACA TGG GAA GCA 

3705 TPKFKLPIQKETWEA pNL4-3.seq 

3705 ACT CCT AAA TTT AAA TTA CCC ATA CAA AAG GAA ACA TGG GAA GCA 

4232 TPKFKLPIQKETWEA pHDMHgpm2.seq 

4232 ACT CCC AAG TTC AAG CTG CCC ATC CAG AAG GAG ACC TGG GAG GCC 



! 

3770 



3752 WWTEYWQATWIPEWE NL4-3 genbank.SEQ 

3752 TGG TGG ACA GAG TAT TGG CAA GCC ACC TGG ATT CCT GAG TGG GAG 

3750 WWTEYWQATWIPEWE pNL4-3.seq 

3750 TGG TGG ACA GAG TAT TGG CAA GCC ACC TGG ATT CCT GAG TGG GAG 

4277 WWTEYWQATWIPEWE pHDMHgpm2.seq 

4277 TGG TGG ACC GAG TAC TGG CAG GCC ACC TGG ATC CCC GAG TGG GAG 



1 — 1 — 

3800 3830 



3797 FVNTPPLVKLWYQLE NL4-3 genbank.SEQ 

3797 TTT GTC AAT ACC CCT CCC TTA GTG AAG TTA TGG TAC CAG TTA GAG 

3795 FVNTPPLVKLWYQLE pNL4-3.seq 

3795 TTT GTC AAT ACC CCT CCC TTA GTG AAG TTA TGG TAC CAG TTA GAG 

4322 FVNTPPLVKLWYQLE pHDMHgpm2 . s eq 

4322 TTC GTG AAC ACC CCC CCC CTG GTG AAG CTG TGG TAC CAG CTG GAG 

3860 

3842 KEPIIGAETFYVDGA NL4-3 genbank.SEQ 

3842 AAA GAA CCC ATA ATA GGA GCA GAA ACT TTC TAT GTA GAT GGG GCA 

3840 KEPIIGAETFYVDGA pNL4-3.seq 

3840 AAA GAA CCC ATA ATA GGA GCA GAA ACT TTC TAT GTA GAT GGG GCA 

4367 KEPIIGAETFYVDGA pHDMHgpm2.seq 

4367 AAG GAG CCC ATC ATC GGC GCC GAG ACC TTC TAC GTG GAC GGC GCC 






i 

3890 3920 

3887 ANRETKLGKAGYVTD NL4-3 genbank.SEQ 

3887 GCC AAT AGG GAA ACT AAA TTA GGA AAA GCA GGA TAT GTA ACT GAC 

3885 ANRETKLGKAGYVTD pNL4-3.seq 

3885 GCC AAT AGG GAA ACT AAA TTA GGA AAA GCA GGA TAT GTA ACT GAC 

4412 ANRETKLGKAGYVTD pHDMHgpm2.seq 

4412 GCC AAC CGC GAG ACC AAG CTG GGC AAG GCC GGC TAC GTG ACC GAC 
















i 

3950 

3932 RGRQKVVPLTDTTNQ NL4-3 genbank.SEQ 

3932 AGA GGA AGA CAA ... GTT GTC CCC CTA ACG GAC ACA ACA AAT CAG 

3930 RGRQKVVPLTDTTNQ pNL4-3.seq 

3930 AGA GGA AGA CAA ... GTT GTC CCC CTA ACG GAC ACA ACA AAT CAG 

4457 RGRQKVVPLTDTTNQ pHDMHgpm2.seq 

4457 CGC GGC CGC CAG AAG GTG GTG CCC CTG ACC GAC ACC ACC AAC CAG 
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Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 



1 1 == 

3980 4010 

£ I 

3977 KTELQAI HLALQDS G NL4-3 genbank.SEQ 

3977 AAG ACT GAG TTA CAA GCA ATT CAT CTA GCT TTG CAG GAT TCG GGA 

3975 KTELQAI HLALQDS G pNL4-3.seq 

3975 AAG ACT GAG TTA CAA GCA ATT CAT CTA GCT TTG CAG GAT TCG GGA 

4502 KTELQAI HLALQDSG pHDMHgpm2 . seq 

4502 AAG ACC GAG CTG CAG GCC ATC CAC CTG GCC CTG CAA GAC TCC GGC 

1 

4040 

i 

4022 LEVNIVTDSQYALGI NL4-3 genbank.SEQ 

4022 TTA GAA GTA AAC ATA GTG ACA GAC TCA CAA TAT GCA TTG GGA ATC 

4020 LEVNIVTDSQYALGI pNL4-3.seq 

4020 TTA GAA GTA AAC ATA GTG ACA GAC TCA CAA TAT GCA TTG GGA ATC 

4547 LEVNIVTDSQYALGI pHDMHgpm2.seq 
4547 CTG GAG GTG AAC ATC GTG ACC GAC TCC CAG TAT GCA TTG GGC ATC 

1 1 

4070 4100 
I I 

4067 IQAQPDKSESELVSQ NL4-3 genbank.SEQ 

4067 ATT CAA GCA CAA CCA GAT AAG AGT GAA TCA GAG TTA GTC AGT CAA 

4065 IQAQPDKSESELVSQ pNL4-3.seq 

4065 ATT CAA GCA CAA CCA GAT AAG AGT GAA TCA GAG TTA GTC AGT CAA 

4592 IQAQPDKSESELVSQ pHDMHgpm2 . seq 

4592 ATC CAG GCC CAG CCC GAC AAG TCC GAG TCC GAG CTG GTG TCC CAG 

1 

4130 

i 

4112 IIEQLI KKEKVYLAW NL4-3 genbank.SEQ 

4112 ATA ATA GAG CAG TTA ATA AAA AAG GAA AAA GTC TAC CTG GCA TGG 

4110 IIEQLI KKEKVYLAW pNL4-3 . seq 

4110 ATA ATA GAG CAG TTA ATA AAA AAG GAA AAA GTC TAC CTG GCA TGG 

4637 IIEQLIKKEKVYLAW pHDMHgpm2.seq 

4637 ATC ATC GAG CAG CTG ATC AAG AAG GAG AAG GTG TAC CTG GCC TGG 

1 1 

4160 4190 
I I 

4157 VPAHKGIGGNEQVDG NL4-3 genbank.SEQ 

4157 GTA CCA GCA CAC AAA GGA ATT GGA GGA AAT GAA CAA GTA GAT GGG 

4155 VPAHKGI GGNEQVDK pNL4-3.seq 

4155 GTA CCA GCA CAC AAA GGA ATT GGA GGA AAT GAA CAA GTA GAT AAG 

4682 VPAHKGI GGNEQVDK pHDMHgpm2 . seq 

4682 GTG CCC GCC CAC AAG GGC ATC GGC GGC AAC GAG CAG GTG GAC AAG 

1 

4220 

i 

4202 LVSAGI RKVLFLDGI NL4-3 genbank.SEQ 

4202 TTG GTC AGT GCT GGA ATC AGG AAA GTA CTA TTT TTA GAT GGA ATA 

4200 LVSAGIRKVLFLDGI pNL4-3.seq 

4200 TTG GTC AGT GCT GGA ATC AGG AAA GTA CTA TTT TTA GAT GGA ATA 

4727 LVSAGIRKVLFLDGI pHDMHgpm2 . seq 
4727 CTG GTG TCC GCC GGC ATC CGC AAG GTG CTG TTC CTG GAC GGC ATC 



Fig. 9H 



Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 



i . 

4250 4280 

» 1 

4247 DKAQEEHEKYHSNWR NL4-3 genbank. SEQ 

4247 GAT AAG GCC CAA GAA GAA CAT GAG AAA TAT CAC AGT AAT TGG AGA 

4245 DKAQEEHEKYHSNWR pNL4-3.seq 

4245 GAT AAG GCC CAA GAA GAA CAT GAG AAA TAT CAC AGT AAT TGG AGA 

4772 DKAQEEHEKYHSNWR pHDMHgpm2 . seq 

4772 GAC AAG GCC CAG GAG GAG CAC GAG AAG TAC CAC TCC AAC TGG CGC 

. _ 1 ■ 

4310 

I 

4292 AMAS DFNLP PVVAKE NL4-3 genbank. SEQ 

4292 GCA ATG GCT AGT GAT TTT AAC CTA CCA CCT GTA GTA GCA AAA GAA 

4290 AMASDFNLPPVVAKE pNL4-3.seq 

4290 GCA ATG GCT AGT GAT TTT AAC CTA CCA CCT GTA GTA GCA AAA GAA 

4817 AMASDFNLPPVVAKE pHDMHgpm2.seq 

4817 GCC ATG GCC TCC GAC TTC AAC CTG CCC CCC GTG GTG GCC AAG GAG 

1 _ 1 

4340 4370 
L_ — 1 

4337 IVASCDKCQLKGEAM NL4-3 genbank. SEQ 

4337 ATA GTA GCC AGC TGT GAT AAA TGT CAG CTA AAA GGG GAA GCC ATG 

4335 IVASCDKCQLKGEAM pNL4-3.seq 

4335 ATA GTA GCC AGC TGT GAT AAA TGT CAG CTA AAA GGG GAA GCC ATG 

4862 IVASCDKCQLKGEAM pHDMHgpm2 . seq 

4862 ATC GTG GCC TCC TGC GAC AAG TGC CAG CTG AAG GGC GAG GCC ATG 

— I — ~ 

4400 

i - , 

4382 HGQVDCSPGIWQLDC NL4-3 genbank. SEQ 

4382 CAT GGA CAA GTA GAC TGT AGC CCA GGA ATA TGG CAG CTA GAT TGT 

4380 HGQVDCSPGIWQLDC pNL4-3.seq 

4380 CAT GGA CAA GTA GAC TGT AGC CCA GGA ATA TGG CAG CTA GAT TGT 

4907 HGQVDCSPGIWQLDC pHDMHgpm2 . seq 

4907 CAC GGC CAG GTG GAC TGC TCC CCC GGC ATC TGG CAG CTG GAC TGC 



T t 

4430 4460 



4427 THLEGKVILVAVHVA NL4-3 genbank. SEQ 

4427 ACA CAT TTA GAA GGA AAA GTT ATC TTG GTA GCA GTT CAT GTA GCC 
4425 THLEGKVILVAVHVA pNL4-3.seq 

4425 ACA CAT TTA GAA GGA AAA GTT ATC TTG GTA GCA GTT CAT GTA GCC 

4952 THLEGKVILVAVHVA pHDMHgpm2.seq 

4952 ACC CAC CTG GAG GGC AAG GTG ATC CTG GTG GCC GTG CAC GTG GCC 

. — 1 ~ 

4490 



X 



4472 SGYIEAEVI PAETGQ NL4-3 genbank. SEQ 

4472 AGT GGA TAT ATA GAA GCA GAA GTA ATT CCA GCA GAG ACA GGG CAA 

4470 SGYIEAEVIPAETGQ pNL4-3.seq 

4470 AGT GGA TAT ATA GAA GCA GAA GTA ATT CCA GCA GAG ACA GGG CAA 

4997 SGYIEAEVIPAETGQ pHDMHgpm2.seq 

4997 TCC GGC TAC ATC GAG GCC GAG GTG ATC CCC GCC GAG ACC GGC CAG 
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Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 



i 

4520 4550 

i 1 

4517 ETAYFLLKLAGRWPV NL4-3 genbank.SEQ 

4517 GAA ACA GCA TAC TTC CTC TTA AAA TTA GCA GGA AGA TGG CCA GTA 

4515 ETAYFLLKLAGRWPV pNL4-3.seq 

4515 GAA ACA GCA TAC TTC CTC TTA AAA TTA GCA GGA AGA TGG CCA GTA 

5042 ETAYFLLKLAGRWPV pHDMHgpm2 . seq 

5042 GAG ACC GCC TAC TTC CTG CTG AAG CTG GCC GGC CGC TGG CCC GTG 



1 

4580 



4562 KTVHTDNGSNFTSTT NL4-3 genbank.SEQ 

4562 AAA ACA GTA CAT ACA GAC AAT GGC AGC AAT TTC ACC AGT ACT ACA 

4560 KTVHTDNGSNFTSTT pNL4-3.seq 

4560 AAA ACA GTA CAT ACA GAC AAT GGC AGC AAT TTC ACC AGT ACT ACA 

5087 KTVHTDNGSNFTSTT pHDMHgpm2.seq 
5087 AAG ACC GTG CAC ACC GAC AAC GGC TCC AAC TTC ACC TCC ACC ACC 



r ~ 1 

4610 4640 
I 1 

4607 VKAACWWAGI KQEFG NL4-3 genbank.SEQ 

4607 GTT AAG GCC GCC TGT TGG TGG GCG GGG ATC AAG CAG GAA TTT GGC 

4605 VKAACWWAGI KQEFG pNL4-3.seq 

4605 GTT AAG GCC GCC TGT TGG TGG GCG GGG ATC AAG CAG GAA TTT GGC 

5132 VKAACWWAGI KQEFG pHDMHgpm2 . seq 

5132 GTG AAG GCC GCC TGC TGG TGG GCC GGC ATC AAG CAG GAG TTC GGC 



1 

4670 



4652 IPYNPQSQGVIESMN NL4-3 genbank.SEQ 

4652 ATT CCC TAC AAT CCC CAA AGT CAA GGA GTA ATA GAA TCT ATG AAT 

4650 IPYNPQSQGVIESMN pNL4-3.seq 

4650 ATT CCC TAC AAT CCC CAA AGT CAA GGA GTA ATA GAA TCT ATG AAT 

5177 IPYNPQSQGVIESMN pHDMHgpm2.seq 

5177 ATC CCC TAC AAC CCC CAG TCC CAG GGC GTG ATC GAG TCC ATG AAC 

, — , 

4700 4730 



4697 KELKKI I GQVRDQAE NL4-3 genbank.SEQ 

4697 AAA GAA TTA AAG AAA ATT ATA GGA CAG GTA AGA GAT CAG GCT GAA 

4695 KELKKIIGQVRDQAE pNL4-3.seq 

4695 AAA GAA TTA AAG AAA ATT ATA GGA CAG GTA AGA GAT CAG GCT GAA 

5222 KELKKI I GQVRDQAE pHDMHgpm2 . s eq 

5222 AAG GAG CTG AAG AAG ATC ATC GGC CAA GTC CGC GAC CAG GCC GAG 



1 

4760 

_L 



4742 HLKTAVQMAVFIHNF NL4-3 genbank.SEQ 

4742 CAT CTT AAG ACA GCA GTA CAA ATG GCA GTA TTC ATC CAC AAT TTT 

4740 HLKTAVQMAVFIHNF pNL4-3.seq 

4740 CAT CTT AAG ACA GCA GTA CAA ATG GCA GTA TTC ATC CAC AAT TTT 

5267 HLKTAVQMAVFIHNF pHDMHgpm2 . seq 

5267 CAC CTG AAG ACC GCC GTG CAG ATG GCC GTG TTC ATC CAC AAC TTC 
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Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 



I ! — 1 

4790 4820 

. — l I 

4787 KRKGGI GGYSAGERI NL4-3 genbank.SEQ 

4787 AAA AGA AAA GGG GGG ATT GGG GGG TAC AGT GCA GGG GAA AGA ATA 

4785 KRKGGI GGYSAGERI pNL4-3.seq 

4785 AAA AGA AAA GGG GGG ATT GGG GGG TAC AGT GCA GGG GAA AGA ATA 

5312 KRKGGI GGYSAGERI pHDMHgpm2 . s eq 

5312 AAG CGC AAG GGC GGC ATC GGC GGC TAC TCC GCC GGC GAG CGC ATC 

I — 

4850 

1 „ 

4832 VDIIATDIQTKELQK NL4-3 genbank.SEQ 

4832 GTA GAC ATA ATA GCA ACA GAC ATA CAA ACT AAA GAA TTA CAA AAA 

4830 VDIIATDIQTKELQK pNL4-3.seq 

4830 GTA GAC ATA ATA GCA ACA GAC ATA CAA ACT AAA GAA TTA CAA AAA 

5357 VDIIATDIQTKELQK pHDMHgpm2 . s eq 
5357 GTG GAC ATC ATC GCC ACC GAC ATC CAG ACC AAG GAG CTG CAG AAG 

1 , — 

4880 4910 

} ! 

4877 QITKI QNFRVYYRDS NL4-3 genbank.SEQ 

4877 CAA ATT ACA AAA ATT CAA AAT TTT CGG GTT TAT TAC AGG GAC AGC 

4875 QITKIQNFRVYYRDS pNL4-3.seq 

4875 CAA ATT ACA AAA ATT CAA AAT TTT CGG GTT TAT TAC AGG GAC AGC 

5402 QITKIQNFRVYYRDS pHDMHgpm2 . s eq 

5402 CAG ATC ACC AAG ATC CAG AAC TTC CGC GTG TAC TAC CGC GAC TCC 

i 

4940 

1 „ 

4922 RDPVWKGPAKLLWKG NL4-3 genbank.SEQ 

4922 AGA GAT CCA GTT TGG AAA GGA CCA GCA AAG CTC CTC TGG AAA GGT 

4920 RDPVWKGPAKLLWKG pNL4-3.seq 

4920 AGA GAT CCA GTT TGG AAA GGA CCA GCA AAG CTC CTC TGG AAA GGT 

5447 RDPVWKGPAKLLWKG pHDMHgpm2 . seq 
5447 CGC GAC CCC GTG TGG AAG GGC CCC GCC AAG CTG CTG TGG AAG GGC 

1 1 

4970 5000 
I i 

4967 EGAVVIQDNSDIKVV NL4-3 genbank.SEQ 

4967 GAA GGG GCA GTA GTA ATA CAA GAT AAT AGT GAC ATA AAA GTA GTG 

4965 EGAVVIQDNSDIKVV pNL4-3.seq 

4965 GAA GGG GCA GTA GTA ATA CAA GAT AAT AGT GAC ATA AAA GTA GTG 

5492 EGAVVIQDNSDIKVV pHDMHgpm2 . seq 

5492 GAG GGC GCC GTG GTG ATC CAG GAC AAC TCC GAC ATC AAG GTG GTG 

T 

5030 

,, i , , 

5012 PRRKAKIIRDYGKQM NL4-3 genbank.SEQ 

5012 CCA AGA AGA AAA GCA AAG ATC ATC AGG GAT TAT GGA AAA CAG ATG 

5010 PRRKAKIIRDYGKQM pNL4-3.seq 

5010 CCA AGA AGA AAA GCA AAG ATC ATC AGG GAT TAT GGA AAA CAG ATG 

5537 PRRKAKIIRDYGKQM pHDMHgpm2 . seq 
5537 CCC CGC CGC AAG GCC AAG ATC ATC CGC GAC TAC GGC AAG CAG ATG 
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Alignment Report of Codon Optimization (pol).MEG, using Clustal method with PAM250 residue weight table. 



5060 5090 

5057 AGDDCVASRQDED NL4-3 genbank.SEQ 

5057 GCA GGT GAT GAT TGT GTG GCA AGT AGA CAG GAT GAG GAT TAA 

5055 AGDDCVASRQDED pNL4-3.seq 

5055 GCA GGT GAT GAT TGT GTG GCA AGT AGA CAG GAT GAG GAT 

5582 AGDDCVASRQDED pHDMHgpm2.seq 

5582 GCC GGC GAC GAC TGC GTG GCC TCC CGC CAG GAC GAG GAC 
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AGCTTGGCCC ATTGCATACG TTGTATCCAT ATCATAATAT GTACATTTAT ATTGGCTCAT 60 

GTCCAACATT ACCGCCATGT TGACATTGAT TATTGACTAG TTATTAATAG TAATCAATTA 120 

CGGGGTCATT AGTTCATAGC CCATATATGG AGTTCCGCGT TACATAACTT ACGGTAAATG 180 

GCCCGCCTGG CTGACCGCCC AACGACCCCC GCCCATTGAC GTCAATAATG ACGTATGTTC 240 

CCATAGTAAC GCCAATAGGG ACTTTCCATT GACGTCAATG GGTGGAGTAT TTACGGTAAA 300 

CTGCCCACTT GGCAGTACAT CAAGTGTATC ATATGCCAAG TACGCCCCCT ATTGACGTCA 360 

ATGACGGTAA ATGGCCCGCC TGGCATTATG CCCAGTACAT GACCTTATGG GACTTTCCTA 420 

CTTGGCAGTA CATCTACGTA TTAGTCATCG CTATTACCAT GGTGATGCGG TTTTGGCAGT 480 

ACATCAATGG GCGTGGATAG CGGTTTGACT CACGGGGATT TCCAAGTCTC CACCCCATTG 540 

ACGTCAATGG GAGTTTGTTT TGGCACCAAA ATCAACGGGA CTTTCCAAAA TGTCGTAACA 600 

ACTCCGCCCC ATTGACGCAA ATGGGCGGTA GGCGTGTACG GTGGGAGGTC TATATAAGCA 660 

GAGCTCGTTT AGTGAACCGT CAGATCGCCT GGAGACGCCA TCCACGCTGT TTTGACCTCC 720 

ATAGAAGACA CCGGGACCGA TCCAGCCTCC CCTCGAAGCT GATCCTGAGA ACTTCAGGGT 780 

GAGTCTATGG GACCCTTGAT GTTTTCTTTC CCCTTCTTTT CTATGGTTAA GTTCATGTCA 840 

TAGGAAGGGG AGAAGTAACA GGGTACACAT ATTGACCAAA TCAGGGTAAT TTTGCATTTG 900 

TAATTTTAAA AAATGCTTTC TTCTTTTAAT ATACTTTTTT GTTTATCTTA TTTCTAATAC 960 

TTTCCCTAAT CTCTTTCTTT CAGGGCAATA ATGATACAAT GTATCATGCC TCTTTGCACC 1020 

ATTCTAAAGA ATAACAGTGA TAATTTCTGG GTTAAGGCAA TAGCAATATT TCTGCATATA 1080 

AATATTTCTG CATATAAATT GTAACTGATG TAAGAGGTTT CATATTGCTA ATAGCAGCTA 1140 

CAATCCAGCT ACCATTCTGC TTTTATTTTA TGGTTGGGAT AAGGCTGGAT TATTCTGAGT 1200 

CCAAGCTAGG CCCTTTTGCT AATCATGTTC ATACCTCTTA TCTTCCTCCC ACAGCTCCTG 1260 

GGCAACGTGC TGGTCTGTGT GCTGGCCCAT CACTTTGGCA AAGAATTCTA GACTGCCATG 1320 

GGCGCCCGCG CCTCCGTGCT GTCCGGCGGC GAGCTGGACA AGTGGGAGAA GATCCGCCTG 1380 

CGCCCCGGCG GCAAGAAGCA GTACAAGCTG AAGCACATCG TGTGGGCCTC CCGCGAGCTG 1440 

GAGCGCTTCG CCGTGAACCC CGGCCTGCTG GAGACCTCCG AGGGCTGCCG CCAGATCCTG 1500 

GGCCAGCTGC AGCCCTCCCT GCAAACCGGC TCCGAGGAGC TGCGCTCCCT GTACAACACC 1560 

ATCGCCGTGC TGTACTGCGT GCACCAGCGC ATCGACGTGA AGGACACCAA GGAGGCCCTG 1620 

GACAAGATCG AGGAGGAGCA GAACAAGTCC AAGAAGAAGG CCCAGCAGGC CGCCGCCGAC 1680 

ACCGGCAACA ACTCCCAGGT GTCCCAGAAC TACCCCATCG TGCAGAACCT GCAGGGCCAG 1740 

ATGGTGCACC AGGCCATCTC CCCCCGCACC CTGAACGCCT GGGTGAAGGT GGTGGAGGAG 1800 

AAGGCCTTCT CCCCCGAAGT CATCCCCATG TTCTCCGCCC TGTCCGAGGG CGCCACCCCC 1860 

CAGGACCTGA ACACCATGCT GAACACCGTG GGCGGCCACC AGGCCGCCAT GCAGATGCTG 1920 

AAGGAGACCA TCAACGAGGA GGCCGCCGAG TGGGACCGCC TGCACCCCGT GCACGCCGGC 1980 

CCCATCGCCC CCGGCCAGAT GCGCGAGCCC CGCGGCTCCG ACATCGCCGG CACCACCTCC 2040 

ACCCTGCAAG AGCAGATCGG CTGGATGACC CACAACCCCC CCATCCCCGT GGGCGAGATC 2100 

TACAAGCGCT GGATCATCCT GGGCCTGAAC AAGATCGTGC GCATGTACTC CCCCACCTCC 2160 

ATCCTGGACA TCCGCCAGGG CCCCAAGGAG CCCTTCCGCG ACTACGTGGA CCGCTTCTAC 2220 

AAGACCCTGC GCGCCGAGCA GGCCTCCCAG GAGGTAAAGA ACTGGATGAC CGAGACCCTG 2280 

CTGGTGCAGA ACGCCAACCC CGACTGCAAG ACCATCCTGA AGGCCCTGGG CCCCGGCGCC 2340 

ACCCTGGAGG AGATGATGAC CGCCTGCCAG GGCGTGGGCG GCCCCGGCCA CAAGGCCCGC 2400 

GTGCTGGCCG AGGCCATGTC CCAAGTCACC AACCCCGCCA CCATCATGAT CCAGAAGGGC 2460 

AACTTCCGCA ACCAGCGCAA GACCGTGAAG TGCTTCAACT GCGGCAAGGA GGGCCACATC 2520 

GCCAAGAACT GCCGCGCCCC CCGCAAGAAG GGCTGCTGGA AGTGCGGCAA GGAGGGCCAC 2580 

CAGATGAAAG ATTGTACTGA GAGACAGGCT AATTTTTTAG GGAAGATCTG GCCTTCCCAC 2640 

AAGGGAAGGC CAGGGAATTT TCTTCAGAGC AGACCAGAGC CAACAGCCCC ACCAGAAGAG 2700 

AGCTTCAGGT TTGGGGAAGA GACAACAACT CCCTCTCAGA AGCAGGAGCC GATAGACAAG 2760 

GAACTGTATC CTTTAGCTTC CCTCAGATCA CTCTTTGGCA GCGACCCCTC GTCACAATAA 2820 
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AGATCGGTGG CCAGCTGAAG GAGGCCCTGC TGGACACCGG CGCCGACGAC ACCGTGCTGG 2880 

AGGAGATGAA CCTGCCCGGC CGCTGGAAGC CCAAGATGAT CGGCGGCATC GGCGGCTTCA 2940 

TCAAAGTCCG CCAGTACGAC CAGATCCTGA TGGAGATCTG CGGCCACAAG GCCATCGGCA 3000 

CCGTGCTGGT GGGCCCCACC CCCGTGAACA TCATCGGCCG CAACCTGCTG ACCCAGATCG 3060 

GCTGCACCCT GAACTTCCCC ATCTCCCCCA TCGAGACCGT GCCCGTGAAG CTGAAGCCCG 3120 

GCATGGACGG CCCCAAAGTC AAGCAGTGGC CCCTGACCGA GGAGAAGATC AAGGCCCTGG 3180 

TGGAGATCTG CACCGAGATG GAGAAGGAGG GCAAGATCTC CAAGATCGGC CCCGAGAACC 3240 

CCTACAACAC CCCCGTGTTC GCCATCAAGA AGAAGGACTC CACCAAGTGG CGCAAGCTGG 3300 

TGGACTTCCG CGAGCTGAAC AAGCGCACCC AGGACTTCTG GGAGGTGCAG CTGGGCATCC 3360 

CCCACCCCGC CGGCCTGAAG CAGAAGAAGT CCGTGACCGT GCTGGACGTG GGCGACGCCT 3420 

ACTTCTCCGT GCCCCTGGAC AAGGACTTCC GCAAGTACAC CGCCTTCACC ATCCCCTCCA 3480 

TCAACAACGA GACCCCCGGC ATCCGCTACC AGTACAACGT GCTGCCCCAG GGCTGGAAGG 3540 

GCTCCCCCGC CATCTTCCAG TGCTCCATGA CCAAGATCCT GGAGCCCTTC CGCAAGCAGA 3600 

ACCCCGACAT CGTGATCTAC CAGTACATGG ACGACCTGTA CGTGGGCTCC GACCTGGAGA 3660 

TCGGCCAGCA CCGCACCAAG ATCGAGGAGC TGCGCCAGCA CCTGCTGCGC TGGGGCTTCA 3720 

CCACCCCCGA CAAGAAGCAC CAGAAGGAGC CCCCCTTCCT GTGGATGGGC TACGAGCTGC 3780 

ACCCCGACAA GTGGACCGTG CAGCCCATCG TGCTGCCCGA GAAGGACTCC TGGACCGTGA 3840 

ACGACATCCA GAAGCTGGTG GGCAAGCTGA ACTGGGCCTC CCAGATCTAC GCCGGCATCA 3900 

AAGTCCGCCA GCTGTGCAAG CTGCTGCGCG GCACCAAGGC CCTGACCGAG GTGGTGCCCC 3960 

TGACCGAGGA GGCCGAGCTG GAGCTGGCCG AGAACCGCGA GATCCTGAAG GAGCCCGTGC 4020 

ACGGCGTGTA CTACGACCCC TCCAAGGACC TGATCGCCGA GATCCAGAAG CAGGGCCAGG 4080 

GCCAGTGGAC CTACCAGATC TACCAGGAGC CCTTCAAGAA CCTGAAGACC GGCAAATACG 4140 

CCCGCATGAA GGGCGCCCAC ACCAACGACG TGAAGCAGCT GACCGAGGCC GTGCAGAAGA 4200 

TCGCCACCGA GTCCATCGTG ATCTGGGGCA AGACTCCCAA GTTCAAGCTG CCCATCCAGA 4260 

AGGAGACCTG GGAGGCCTGG TGGACCGAGT ACTGGCAGGC CACCTGGATC CCCGAGTGGG 4320 

AGTTCGTGAA CACCCCCCCC CTGGTGAAGC TGTGGTACCA GCTGGAGAAG GAGCCCATCA 4380 

TCGGCGCCGA GACCTTCTAC GTGGACGGCG CCGCCAACCG CGAGACCAAG CTGGGCAAGG 4440 

CCGGCTACGT GACCGACCGC GGCCGCCAGA AGGTGGTGCC CCTGACCGAC ACCACCAACC 4500 

AGAAGACCGA GCTGCAGGCC ATCCACCTGG CCCTGCAAGA CTCCGGCCTG GAGGTGAACA 4560 

TCGTGACCGA CTCCCAGTAT GCATTGGGCA TCATCCAGGC CCAGCCCGAC AAGTCCGAGT 4620 

CCGAGCTGGT GTCCCAGATC ATCGAGCAGC TGATCAAGAA GGAGAAGGTG TACCTGGCCT 4680 

GGGTGCCCGC CCACAAGGGC ATCGGCGGCA ACGAGCAGGT GGACAAGCTG GTGTCCGCCG 4740 

GCATCCGCAA GGTGCTGTTC CTGGACGGCA TCGACAAGGC CCAGGAGGAG CACGAGAAGT 4800 

ACCACTCCAA CTGGCGCGCC ATGGCCTCCG ACTTCAACCT GCCCCCCGTG GTGGCCAAGG 4860 

AGATCGTGGC CTCCTGCGAC AAGTGCCAGC TGAAGGGCGA GGCCATGCAC GGCCAGGTGG 4920 

ACTGCTCCCC CGGCATCTGG CAGCTGGACT GCACCCACCT GGAGGGCAAG GTGATCCTGG 4980 

TGGCCGTGCA CGTGGCCTCC GGCTACATCG AGGCCGAGGT GATCCCCGCC GAGACCGGCC 5040 

AGGAGACCGC CTACTTCCTG CTGAAGCTGG CCGGCCGCTG GCCCGTGAAG ACCGTGCACA 5100 

CCGACAACGG CTCCAACTTC ACCTCCACCA CCGTGAAGGC CGCCTGCTGG TGGGCCGGCA 5160 

TCAAGCAGGA GTTCGGCATC CCCTACAACC CCCAGTCCCA GGGCGTGATC GAGTCCATGA 5220 

ACAAGGAGCT GAAGAAGATC ATCGGCCAAG TCCGCGACCA GGCCGAGCAC CTGAAGACCG 5280 

CCGTGCAGAT GGCCGTGTTC ATCCACAACT TCAAGCGCAA GGGCGGCATC GGCGGCTACT 5340 

CCGCCGGCGA GCGCATCGTG GACATCATCG CCACCGACAT CCAGACCAAG GAGCTGCAGA 5400 

AGCAGATCAC CAAGATCCAG AACTTCCGCG TGTACTACCG CGACTCCCGC GACCCCGTGT 5460 

GGAAGGGCCC CGCCAAGCTG CTGTGGAAGG GCGAGGGCGC CGTGGTGATC CAGGACAACT 5520 

CCGACATCAA GGTGGTGCCC CGCCGCAAGG CCAAGATCAT CCGCGACTAC GGCAAGCAGA 5580 

TGGCCGGCGA CGACTGCGTG GCCTCCCGCC AGGACGAGGA CTAACACATG GAAAAGATTA 5640 
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GTAAAACACC ATAGGCCGCT CTAGAGGATC CAAGCTTATC GATACCGTCG ACCTCGAGGG 5700

CCCAGATCTA ATTCACCCCA CCAGTGCAGG CTGCCTATCA GAAAGTGGTG GCTGGTGTGG 5760

CTAATGCCCT GGCCCACAAG TATCACTAAG CTCGCTTTCT TGCTGTCCAA TTTCTATTAA 5820

AGGTTCCTTT GTTCCCTAAG TCCAACTACT AAACTGGGGG ATATTATGAA GGGCCTTGAG 5880

CATGTGGATT CTGCCTAATA AAAAACATTT ATTTTCATTG CAATGATGTA TTTAAATTAT 5940

TTCTGAATAT TTTACTAAAA AGGGAATGTG GGAGGTCAGT GCATTTAAAA CATAAAGAAA 6000

TGAAGAGCTA GTTCAAACCT TGGGAAAATA CACTATATCT TAAACTCCAT GAAAGAAGGT 6060

GAGGCTGCAA ACAGCTAATG CACATTGGCA ACAGCCCCTG ATGCCTATGC CTTATTCATC 6120

CCTCAGAAAA GGATTCAAGT AGAGGCTTGA TTTGGAGGTT AAAGTTTTGC TATGCTGTAT 6180

TTTACATTAC TTATTGTTTT AGCTGTCCTC ATGAATGTCT TTTCACTACC CATTTGCTTA 6240

TCCTGCATCT CTCAGCCTTG ACTCCACTCA GTTCTCTTGC TTAGAGATAC CACCTTTCCC 6300

CTGAAGTGTT CCTTCCATGT TTTACGGCGA GATGGTTTCT CCTCGCCTGG CCACTCAGCC 6360

TTAGTTGTCT CTGTTGTCTT ATAGAGGTCT ACTTGAAGAA GGAAAAACAG GGGGCATGGT 6420

TTGACTGTCC TGTGAGCCCT TCTTCCCTGC CTCCCCCACT CACAGTGACC CGGAATCCCT 6480

CGACATGGCA GTCTAGATCA TTCTTGAAGA CGAAAGGGCC TCGTGATACG CCTATTTTTA 6540

TAGGTTAATG TCATGATAAT AATGGTTTCT TAGACGTCAG GTGGCACTTT TCGGGGAAAT 6600

GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA TCCGCTCATG 6660

AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA GGAAGAGTAT GAGTATTCAA 6720

CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT GCCTTCCTGT TTTTGCTCAC 6780

CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT TGGGTGCACG AGTGGGTTAC 6840

ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT TTCGCCCCGA AGAACGTTTT 6900

CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG TATTATCCCG TATTGACGCC 6960

GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA ATGACTTGGT TGAGTACTCA 7020

CCAGTCACAG AAAAGCATCT TAGGGATGGC ATGACAGTAA GAGAATTATG CAGTGCTGCC 7080

ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA CAACGATCGG AGGACCGAAG 7140

GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA CTCGCCTTGA TCGTTGGGAA 7200

CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GAGCGTGACA CCACGATGCC TGTAGCAATG 7260

GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA CTCTAGCTTC CCGGCAACAA 7320

TTAATAGACT GGATGGAGGC GGATAAAGTT GCAGGACCAC TTCTGCGCTC GGCCCTTCCG 7380

GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC GTGGGTCTCG CGGTATCATT 7440

GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG TTATCTACAC GACGGGGAGT 7500

CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA TAGGTGCCTC ACTGATTAAG 7560

CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT AGATTGATTT AAAACTTCAT 7620

TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA ATCTCATGAC CAAAATCCCT 7680

TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGCAG AAAAGATCAA AGGATCTTCT 7740

TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA CAAAAAAACC ACCGCTACCA 7800

GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA CCAACTCTTT TTCCGAAGGT AACTGGCTTC 7860

AGCAGAGCGC AGATACCAAA TACTGTTCTT CTAGTGTAGC CGTAGTTAGG CCACCACTTC 7920

AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA TCCTGTTACC AGTGGCTGCT 7980

GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA GACGATAGTT ACCGGATAAG 8040

GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC CCAGCTTGGA GCGAACGACC 8100

TACACCGAAC TGAGATACCT ACAGCGTGAG CTATGAGAAA GCGCCACGCT TCCCGAAGGG 8160

AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA CAGGAGAGCG CACGAGGGAG 8220

CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG GGTTTCGCCA CCTCTGACTT 8280

GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC TATGGAAAAA CGCCAGCAAC 8340

GGATGCGCCG CGTGCGGCTG CTGGAGATGG CGGACGCGAT GGATATGTTC TGCCAAGGGT 8400

TGGTTTGCGC ATTCACAGTT CTCCGCAAGA ATTGATTGGC TCCAATTCTT GGAGTGGTGA 8460
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ATCCGTTAGC GAGGTGCCGC CGGCTTCCAT TCAGGTCGAG GTGGCCCGGC TCCATGCACC 8520 

GCGACGCAAC GCGGGGAGGC AGACAAGGTA TAGGGCGGCG CCTACAATCC ATGCCAACCC 8580 

GTTCCATGTG CTCGCCGAGG CGGCATAAAT CCCCGTGACG ATCAGCGGTC CAATGATCGA 8640 

AGTTAGGCTG GTAAGAGCCG CGAGCGATCC TTGAAGCTGT CCCTGATGGT CGTCATCTAC 8700 

CTGCCTGGAC AGCATGGCCT GCAACGCGGG CATCCCGATG CCGCCGGAAG CGAGAAGAAT 8760 

CATAATGGGG AAGGCCATCC AGCCTCGCGT CGGGGAGCTT TTTGCAAAAG CCTAGGCCTC 8820 

CAAAAAAGCC TCCTCACTAC TTCTGGAATA GCTCAGAGGC CGAGGCGGCC TCGGCCTCTG 8880 
CATAAATAAA AAAAATTAGT CAGCCATG 8908 
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